Technical  Report  1541 


Parameter  Estimation 
in  Chaotic  Systems 


Elmer  S.  Hung 


MIT  Artificial  Intelligence  Laboratory 


DISTRIBUTION STATEMENT A

Approved  for  public  release; 
Distribution  Unlimited 




REPORT  DOCUMENTATION  PAGE 



1. AGENCY USE ONLY (Leave Blank)

2. REPORT DATE

April 1995

3. REPORT TYPE AND DATES COVERED

technical report


4.  TITLE  AND  SUBTITLE 

Parameter  Estimation  in  Chaotic  Systems 


5.  FUNDING  NUMBERS 

N00014-92-J-4097 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 

AITR  1541 


10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 


6.  AUTHOR(S) 

Elmer  S.  Hung 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Massachusetts  Institute  of  Technology 
Artificial  Intelligence  Laboratory 
545  Technology  Square 
Cambridge,  Massachusetts  02139 


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Office  of  Naval  Research 

Information Systems

Arlington, Virginia 22217


11. SUPPLEMENTARY NOTES

None 


12a.  DISTRIBUTION/AVAILABILITY  STATEMENT 


DISTRIBUTION  UNLIMITED 


13.  ABSTRACT  (Maximum  200  words) 

This report examines how to estimate the parameters of a chaotic system given noisy observations of the state behavior of the system.
Investigating parameter estimation for chaotic systems is interesting because of possible applications for high precision measurement and for use
in other signal processing, communication, and control applications involving chaotic systems. In this report, we examine theoretical issues
regarding parameter estimation in chaotic systems and develop an efficient algorithm to perform parameter estimation. We discover two
properties that are helpful for performing parameter estimation on non-structurally stable systems. First, it turns out that most data in a time series
of state observations contribute very little information about the underlying parameters of a system, while a few sections of data may be
extraordinarily sensitive to parameter changes. Second, for one-parameter families of systems, we demonstrate that there is often a preferred
direction in parameter space governing how easily trajectories of one system can "shadow" trajectories of nearby systems. This asymmetry of
shadowing behavior in parameter space is proved for certain families of maps of the interval. Numerical evidence indicates that similar results
may be true for a wide variety of other systems. Using the two properties cited above, we devise an algorithm for performing parameter
estimation. Standard parameter estimation techniques such as the extended Kalman filter perform poorly on chaotic systems because of
divergence problems. The proposed algorithm achieves accuracies several orders of magnitude better than the Kalman filter and has good
convergence properties for large data sets. In some systems the algorithm converges at a rate proportional to $\frac{1}{n^{2}}$ where $n$ is
the number of state samples processed. This is significantly better than the $\frac{1}{\sqrt{n}}$ convergence one would expect from nonchaotic
oscillators based on purely stochastic considerations.


12b.  DISTRIBUTION  CODE 


14. SUBJECT TERMS

MIT,  Nonlinear  Dynamics,  Estimation,  Parameter  Estimation,  Signal 
Processing,  Chaos,  Kalman  Filters,  Chaotic  Time  Series 


15.  NUMBER  OF  PAGES 

184 


16.  PRICE  CODE 


17. SECURITY CLASSIFICATION OF REPORT: UNCLASSIFIED

18. SECURITY CLASSIFICATION OF THIS PAGE: UNCLASSIFIED

19. SECURITY CLASSIFICATION OF ABSTRACT: UNCLASSIFIED

20. LIMITATION OF ABSTRACT: UNCLASSIFIED




MASSACHUSETTS  INSTITUTE  OF  TECHNOLOGY 
ARTIFICIAL  INTELLIGENCE  LABORATORY 


A. I.  Technical  Report  No.  1541 


May,  1995 


Parameter Estimation in Chaotic Systems


Elmer  S.  Hung 

This  publication  can  be  retrieved  by  anonymous  ftp  to  publications.ai.mit.edu. 


Abstract 

This report examines how to estimate the parameters of a chaotic system given noisy observations of
the state behavior of the system. Investigating parameter estimation for chaotic systems is interesting
because of possible applications for high-precision measurement and for use in other signal processing,
communication, and control applications involving chaotic systems.

In this report, we examine theoretical issues regarding parameter estimation in chaotic systems and develop
an efficient algorithm to perform parameter estimation. We discover two properties that are helpful for
performing parameter estimation on non-structurally stable systems. First, it turns out that most data in
a time series of state observations contribute very little information about the underlying parameters of
a system, while a few sections of data may be extraordinarily sensitive to parameter changes. Second, for
one-parameter families of systems, we demonstrate that there is often a preferred direction in parameter
space governing how easily trajectories of one system can "shadow" trajectories of nearby systems. This
asymmetry of shadowing behavior in parameter space is proved for certain families of maps of the interval.
Numerical evidence indicates that similar results may be true for a wide variety of other systems.

Using the two properties cited above, we devise an algorithm for performing parameter estimation. Standard
parameter estimation techniques such as the extended Kalman filter perform poorly on chaotic
systems because of divergence problems. The proposed algorithm achieves accuracies several orders of
magnitude better than the Kalman filter and has good convergence properties for large data sets. In some
systems the algorithm converges at a rate proportional to $\frac{1}{n^{2}}$ where $n$ is the number of state samples
processed. This is significantly better than the $\frac{1}{\sqrt{n}}$ convergence one would expect from nonchaotic oscillators
based on purely stochastic considerations.
Chapter 1
Introduction 


In  this  report  we  investigate  theoretical  limitations  and  develop  computational  methods 
for  estimating  the  parameters  of  a  chaotic  system  given  a  noisy  time  series  of  state  data 
about  the  system.  There  are  two  primary  reasons  why  we  are  interested  in  parameter 
estimation  of  chaotic  time  series.  First,  there  has  been  considerable  interest  in  recent 
years  regarding  signal  processing  and  control  applications  involving  chaotic  systems  (see 
e.g.,  [11],  [49],  [9]).  Parameter  estimation  has  traditionally  been  an  important  problem  in 
signal  processing  and  control  theory,  so  in  light  of  recent  applications  involving  chaotic 
systems,  it  is  important  to  investigate  what  happens  when  the  signals  and  systems 
involved  are  chaotic. 

Second,  it  has  been  suggested  that  parameter  estimation  in  chaotic  systems  may  have 
applications  for  high-precision  measurement.  In  particular  the  idea  is  that  if  a  system 
is  chaotic  and  displays  a  sensitive  dependence  on  initial  conditions,  then  it  can  also  be 
sensitive  to  small  changes  in  parameter  values.  Consequently,  development  of  successful 
parameter  estimation  techniques  could  make  it  possible  to  measure  the  parameters  of  a 
system  extremely  accurately  given  a  time  series  of  data  about  the  state  of  the  system. 

Our goal in this report is to systematically explore the feasibility of parameter
estimation in chaotic systems including a theoretical analysis of what accuracies we can
reasonably expect to obtain and what factors limit this accuracy. We also present new
numerical algorithms for estimating the parameters of chaotic systems and discuss
simulations demonstrating the performance of the algorithms.

It  turns  out  that  the  parameter  estimation  problem  is  especially  interesting  because 
it  is  simple  enough  that  one  can  look  carefully  at  the  underlying  dynamical  mechanisms 
that  affect  the  feasibility  and  efficiency  of  various  numerical  approaches.  This  is  in 
contrast  with  a  number  of  typical  research  problems  involving  chaotic  time  series  which 
are  broad  enough  that  heuristics  must  generally  be  relied  upon  to  attack  the  problem 
numerically.  On  the  other  hand,  the  parameter  estimation  problem  is  also  complex 
enough that the results are interesting, and in some cases, quite unexpected. As we shall
see,  a  close  examination  of  the  relationship  between  system  dynamics  and  parameter 
estimation  reveals  interesting  observations  that  greatly  aid  in  the  development  of  an 
efficient  numerical  approach. 


1.1  The  problem 

Before proceeding further, we should be more explicit about what is meant by “parameter
estimation.” Basically, the idea is the following: Suppose that we are given a
parameterized family of mappings $f_p(x)$, where $x$ is the state vector of the system and
$p$ are some invariant parameters of the system. We will assume that $f_p(x)$ varies
continuously with $x$ and $p$. Further, suppose that we are given a sequence of observations,^1
$\{y_n\}$, of a certain state orbit, $\{x_n\}$,^2 where:

$$x_{n+1} = f_p(x_n)$$
and
$$y_n = x_n + v_n$$

for all integer $n$, where $v_n$ represents measurement errors in the data stream, $\{y_n\}$. We
are interested in how to estimate the value of $p$ given a stream of data, $\{y_n\}$. Note that
we will concentrate on the discrete-time formulation, but the results apply analogously to
continuous-time systems. For example, one might imagine that time is one of the state
variables of the system, and that the $y_n$'s represent samples of a continuous-time system.

For analytic purposes it is helpful to assume that, to first approximation, the
magnitude of the measurement errors is bounded so that:

$$|v_n| < \epsilon$$

for some $\epsilon > 0$. For purposes of analyzing and evaluating algorithms, it will also be
useful later to think of $v_n$ as a random variable with various probability densities.
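
The setup above can be illustrated with a minimal Python sketch. It assumes the quadratic map $f_p(x) = px(1-x)$ of equation (1.1) as the example family and uniformly distributed noise bounded by $\epsilon$; both choices, and the names `quadratic_map`, `generate_orbit`, and `observe`, are illustrative assumptions rather than anything prescribed by the report.

```python
import random

def quadratic_map(p, x):
    """One step of the assumed example family f_p(x) = p*x*(1-x)."""
    return p * x * (1.0 - x)

def generate_orbit(f, p, x0, n_steps):
    """Return [x_0, ..., x_{n_steps}] with x_{n+1} = f(p, x_n)."""
    orbit = [x0]
    for _ in range(n_steps):
        orbit.append(f(p, orbit[-1]))
    return orbit

def observe(orbit, eps, rng):
    """Return y_n = x_n + v_n with |v_n| <= eps.

    Uniform noise is an assumption made for this sketch; the text only
    requires the measurement errors to be bounded.
    """
    return [x + rng.uniform(-eps, eps) for x in orbit]

# Hypothetical usage: 100 iterates at p = 3.9, x_0 = 0.3, noise bound 0.01.
rng = random.Random(0)
xs = generate_orbit(quadratic_map, 3.9, 0.3, 100)
ys = observe(xs, 0.01, rng)
```

The estimation problem is then to recover `p` from `ys` alone, without access to `xs`.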


1.2  Preview  of  important  issues 

Parameter  estimation  and  shadowing 

Let  us  now  try  to  get  a  flavor  for  some  of  the  important  issues  that  govern  the 
performance  of  parameter  estimation  techniques.  First  of  all,  given  a  family  of  mappings 

^1 Instead of writing $\{x_n\}_{n=0}^{\infty}$ we will sometimes write $\{x_n\}$ to denote an infinite sequence of states.

^2 We will refer to a sequence of states, $\{x_n\}_{n=0}^{\infty}$, as an orbit of the map $f$ if $x_{n+1} = f(x_n)$ for all integer
$n$. Finite sections of infinite orbits, for example $\{x_n\}_{n=0}^{N}$ for some $N > 0$, may also be referred to as
orbits.




of the form, $f_p$, and a noisy stream of state data, $\{y_n\}$, we would like to know which $f_p$'s
have orbits that closely shadow or follow $\{y_n\}$. We know that $\{y_n\}$ represents an actual
orbit of $f_p$ for some value of $p$, with $\epsilon$ magnitude measurement errors added in. Thus,
if no orbit of $f_p$ shadows $\{y_n\}$ within $\epsilon$ error for a particular parameter value, $p_0$, then
$p_0$ cannot be the actual parameter value of the system that is being observed. On the
other hand, if many systems of the form, $f_p$, have orbits that closely shadow $\{y_n\}$, then
it would be difficult to tell from the state data which of these systems is actually being
observed.

It turns out that a significant body of work is available to answer questions like,
“what types of systems are insensitive to small perturbations so that orbits of perturbed
systems shadow orbits of the original system and vice versa?” However, many of the
results in this direction are topological in nature; that is, they answer questions like
whether such shadowing orbits must exist or not for certain classes of systems. On the
other hand, in order to evaluate the possibilities for parameter estimation, it is also
important to know more geometrically-oriented results like, “how closely do shadowing
orbits follow each other for nearby systems in parameter space” and “how long do orbits
of nearby systems follow each other if the orbits do not shadow each other forever.”
Such results tend to be more difficult to establish and also depend more specifically on
the systems involved.

An  example  in  one  dimension 

Investigating the geometry of shadowing orbits can yield some interesting results.
For example, consider the family of maps:

$$f_p(x) = px(1 - x) \qquad (1.1)$$

for $x \in [0, 1]$ and $p \in [0, 4]$. Henceforth we will refer to the family of maps (1.1) as simply
the family of quadratic maps.

It is known ([5]) that for a non-negligible set of parameter values, the quadratic
maps in (1.1) produce chaotic behavior for almost all initial conditions, meaning that
orbits tend to explore intervals in state space, and nearby orbits experience exponential
local divergence (i.e., positive Lyapunov exponents). Suppose that we pick $p_0 = 3.9$ and
iterate an orbit, $\{x_n\}$, of $f_{p_0}$, starting with the initial condition $x_0 = 0.3$. Numerically,
the resulting orbit appears to be chaotic and exhibits the properties cited above, at least
for large numbers of iterates. Now consider the question: “What parameter values, $p$,
produce orbits that shadow $\{x_n\}$ for many iterations of $f_p$?” We can get some idea of the
answer to this question by simply picking various values for $p$ near 3.9 and attempting
to numerically find orbits that shadow $\{x_n\}$. There are a number of issues (see Chapter
5) about how to do this.^3 However, let us for the moment simply assume that the results
we present are at least qualitatively correct.
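
The chaotic behavior described above can be checked numerically with a short Python sketch. It estimates the Lyapunov exponent of the quadratic map as the orbit average of $\log|f_p'(x_n)|$, with $f_p'(x) = p(1 - 2x)$, and records how two nearby orbits separate. The function names, the transient length, and the iteration counts are assumptions made for illustration; floating-point roundoff means the computed orbit is only a stand-in for the true one.

```python
import math

def lyapunov_estimate(p, x0, n_iter, n_skip=100):
    """Average of log|f_p'(x_n)| along an orbit of f_p(x) = p*x*(1-x)."""
    x = x0
    for _ in range(n_skip):           # discard an initial transient
        x = p * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        deriv = p * (1.0 - 2.0 * x)   # f_p'(x)
        total += math.log(max(abs(deriv), 1e-300))  # guard against log(0)
        x = p * x * (1.0 - x)
    return total / n_iter

def separation(p, x0, delta, n_iter):
    """Return |x_n - x'_n| for two orbits started a distance delta apart."""
    a, b = x0, x0 + delta
    gaps = []
    for _ in range(n_iter + 1):
        gaps.append(abs(a - b))
        a = p * a * (1.0 - a)
        b = p * b * (1.0 - b)
    return gaps

lam = lyapunov_estimate(3.9, 0.3, 20000)   # positive value suggests chaos
gaps = separation(3.9, 0.3, 1e-10, 80)     # initial gap grows to macroscopic size
```

A positive estimate is consistent with, but does not prove, exponential local divergence at $p_0 = 3.9$.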

^3 For example, note that because we cannot numerically iterate the orbit $\{x_n\}$ accurately for many
iterations, one could argue that the experiment is dominated by roundoff errors. However, while our
Figures 1.1 and 1.2 show the result of carrying out the described numerical experiment
with $p_0 = 3.9$ and $x_0 = 0.3$. For values of $p$ close to $p_0$, we attempt to find finite
orbits of $f_p$ that closely follow the $f_{p_0}$ orbit, $\{x_n = f_{p_0}^n(x_0)\}_{n=0}^{N}$, for integers $N > 0$.^4 In
order to measure how closely maps with different parameters can shadow $\{x_n\}_{n=0}^{N}$, it is
helpful to define $e(N,p,p_0,x_0)$ to be the maximal distance between the orbit, $\{x_n\}_{n=0}^{N}$,
and the closest shadowing orbit, $\{f_p^n(z_0)\}_{n=0}^{N}$. In other words, let:

$$e(N,p,p_0,x_0) = \inf_{z_0 \in [0,1]} \max_{0 \le n \le N} |f_p^n(z_0) - f_{p_0}^n(x_0)|. \qquad (1.2)$$


So, for each $p$ and integer $N > 0$, $e(N,p,p_0,x_0)$ measures how closely the best possible
shadowing orbit of $f_p$ follows the orbit, $\{x_n\}_{n=0}^{N}$, of $f_{p_0}$. For the purposes of this particular

experiment let $p_0 = 3.9$ and $x_0 = 0.3$ be constant and set $e(N,p) = e(N,p,p_0 = 3.9, x_0 = 0.3)$.
There is nothing particularly special about our choice of $p_0 = 3.9$ or $x_0 = 0.3$. As
we shall see later, many other parameter values and initial conditions yield similar results.
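
For small $N$, the definition in (1.2) can be approximated by brute force. The following Python sketch replaces the infimum over $z_0$ with a minimum over a uniform grid on $[0,1]$, so it returns an upper bound on $e(N,p)$ rather than the exact value. The function name `shadow_distance` and the grid size are assumptions for illustration; this is not the numerical method of the report, which defers those issues to Chapter 5.

```python
def shadow_distance(N, p, p0=3.9, x0=0.3, grid_size=20001):
    """Grid approximation (upper bound) of e(N, p, p0, x0) from (1.2)."""
    # Reference orbit x_0, ..., x_N of f_{p0}.
    ref = [x0]
    for _ in range(N):
        ref.append(p0 * ref[-1] * (1.0 - ref[-1]))

    best = float("inf")
    for i in range(grid_size):
        z = i / (grid_size - 1)        # candidate initial condition z_0
        worst = 0.0                    # running max_n |f_p^n(z_0) - x_n|
        for x in ref:
            d = abs(z - x)
            if d > worst:
                worst = d
                if worst >= best:
                    break              # this z_0 cannot improve the minimum
            z = p * z * (1.0 - z)
        if worst < best:
            best = worst
    return best

# Hypothetical usage: compare parameters just below and just above p0.
e_low = shadow_distance(10, 3.9 - 0.01)
e_high = shadow_distance(10, 3.9 + 0.01)
```

Because nearby initial conditions separate exponentially, the grid resolution needed to locate a good $z_0$ grows rapidly with $N$, so this sketch is only meaningful for short orbit segments.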

Figure 1.1 shows the result of numerically computing $e(N,p)$ with respect to $p$ for
three values of $N$. The three v-shaped traces in the figure represent plots of $e(N,p)$
for $N = 61$, $N = 250$, and $N = 1000$. $e(N,p)$ is plotted on the y-axis, while $p - p_0$,
the difference in parameter value, $p$, from the original parameter value, $p_0$, is labeled on
the x-axis. Note the distinct asymmetry of the graph between values of $p$ greater than
and less than $p_0 = 3.9$. In fact, for $N = 250$ and $N = 1000$ the graph is so steep for
$p < p_0$ that it looks coincident with the vertical line demarking $p - p_0 = 0$. It seems that
at least in this case, systems with parameter values, $p$, less than $p_0$ do not shadow the
orbit, $\{x_n\}$, nearly as easily as those systems with parameter values greater than $p_0$. In
some sense, it seems that orbits for higher parameter values are more flexible, or have a
greater degree of freedom than do orbits for slightly lower parameter values.

This phenomenon of asymmetrical shadowing may seem counterintuitive. If an
orbit, $O(p_0)$, of parameter value $p_0$ is shadowed by an orbit, $O(p_0 + \delta)$, of a slightly
higher parameter value, $p_0 + \delta$, then given the orbit, $O(p_0 + \delta)$, of parameter $p_0 + \delta$, isn't
$O(p_0 + \delta)$ shadowed by the orbit, $O(p_0)$, of a lower parameter value, $p_0$? Yes, but as we
shall see, it may be that the set of orbits of $f_{p_0+\delta}$ that are shadowed by an orbit of $f_{p_0}$ is
actually vanishingly small. That is, if an orbit of $f_{p_0+\delta}$ is generated by choosing an initial
condition at random, we would find that the probability that that orbit is shadowed by
an orbit of $f_{p_0}$ is zero.

Returning to the example at hand, we find that the asymmetry in parameter space
is even more apparent if we consider how $e(N,p)$ varies with $N$. Basically we want to

particular numerically-generated starting orbit may not look like the actual orbit, $\{x_n\}$, with initial
condition $x_0 = 0.3$ for large values of $n$, we will later see that qualitatively the pictures we present are
similar.

^4 Here we let $f^n = f(f^{n-1})$ so that the function, $f^n$, refers to the composition of $f$ with itself $n$
times (define $f^0$ to be simply the identity function). Note that if $x_{n+1} = f(x_n)$ for all integer $n$, then
$x_n = f^n(x_0)$ for all $n$.
keep track of how the curves in figure 1.1 move inward toward the vertical line, $p = p_0$,
as $N$ increases. We can do this by fixing a constant, $\epsilon_0$, and keeping track of which
parameter values, $p$, satisfy $e(N,p) < \epsilon_0$ for varying values of $N$. For example, for a
particular value of $\epsilon_0$, suppose that $I_N$ is the maximal interval in parameter space such
that $p_0 \in I_N$ and $e(N,p) < \epsilon_0$ for all $p \in I_N$. We are interested in what fraction of the
interval, $I_N$, is greater than or less than the original parameter value, $p_0$. To keep track
of this let $I_N = I_N^- \cup I_N^+$ so that $I_N^- = [p_N^-, p_0]$ and $I_N^+ = [p_0, p_N^+]$ where $p_N^- < p_0$ and
$p_N^+ > p_0$. Let $a(N)$ be the length of $I_N^-$ and let $b(N)$ be the length of $I_N^+$. Figure 1.2 shows
graphs of $a(N)$ and $b(N)$ with respect to $N$ as computed numerically for $\epsilon_0 = 0.01$. Note
that the scale for $a(N)$ and $b(N)$ on the y-axis is logarithmic so that $a(N)$ is several
orders of magnitude smaller than $b(N)$ for larger values of $N$, reflecting the asymmetry in
parameter space. Also, we see that $a(N)$ and $b(N)$ both appear approximately constant
for large stretches of $N$ except where $a(N)$ decreases in large increments over a small
number of iterates. We will later see that these decreases in $a(N)$ occur along short
stretches of the orbit, $\{x_n\}$, where small differences in the parameter value of the system
can easily be distinguished by even noisy state data.
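
The definitions of $a(N)$ and $b(N)$ can be sketched in Python as a scan outward from $p_0$ on a grid of parameter values. The function `interval_lengths`, its step-based scan, and the toy function `toy_e` are hypothetical and are not taken from the report; in particular `toy_e` is a made-up asymmetric V shape used only to exercise the code, not data from Figure 1.1. Any estimate of $e(N,p)$ could be passed in its place.

```python
def interval_lengths(e_fn, N, p0, eps0, step, max_width):
    """Approximate (a(N), b(N)): lengths of I_N^- and I_N^+.

    Scans outward from p0 in increments of `step` and stops at the first
    grid point where e_fn(N, p) >= eps0, or at max_width. Only grid points
    are checked, so the result is accurate to about one step.
    """
    def scan(direction):
        width = 0.0
        k = 1
        while k * step <= max_width:
            if e_fn(N, p0 + direction * k * step) >= eps0:
                break
            width = k * step
            k += 1
        return width

    return scan(-1), scan(+1)

def toy_e(N, p, p0=3.9):
    """Synthetic stand-in for e(N, p): steep below p0, shallow above it."""
    return 10.0 * (p0 - p) if p < p0 else 0.1 * (p - p0)

a, b = interval_lengths(toy_e, N=100, p0=3.9, eps0=0.01,
                        step=1e-4, max_width=0.05)
```

With an asymmetric $e(N,p)$ the lower-side length `a` comes out much smaller than the upper-side length `b`, which is the pattern the text describes for Figure 1.2.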

Applying  theory  to  develop  estimation  algorithms 

Figures 1.1 and 1.2 illustrate two interesting properties for the quadratic map example:
(1) there is an asymmetry in the shadowing behavior of maps in parameter space,
and (2) most iterates of a specific orbit are apparently not very sensitive to small changes
in parameter values, while a few special iterates may be especially sensitive to parameter
changes. It turns out that these two properties can be extremely helpful in developing
an algorithm to do parameter estimation.

First of all, the asymmetry illustrated in figure 1.1 can be quite helpful. For instance,
in the example we just considered, few maps, $f_p$, with parameter values lower than $p_0$
have orbits that can shadow the given orbit of $f_{p_0}$. Suppose that we are given noisy
measurements of a state orbit, $\{x_n\}$. If we find that only maps from a certain interval
in parameter space can shadow the observed data, then the real parameter value should
be close to the lower endpoint of this parameter range. Thus, to first order, if $\epsilon_0$ is the
magnitude of measurement error, the error in the parameter estimate is approximately
governed by either $a(N)$ or $b(N)$, whichever one happens to be smaller.

In  addition,  we  will  see  later  that  figure  1.2  reflects  the  fact  that  a  few  sections  of 
the  observed  state  data  stream  contribute  greatly  to  our  knowledge  of  the  parameters  of 
the  system,  while  much  of  the  rest  of  the  data  contributes  almost  no  new  information. 
Thus,  if  we  can  quickly  sift  through  all  the  useless  data  and  examine  the  critical  data 
very  carefully,  we  should  be  able  to  vastly  improve  a  parameter  estimation  technique. 

The  key  to  this  is  whether  or  not  physically  interesting  systems  have  the  properties 
described  above.  A  major  objective  of  this  report  will  be  to  investigate  the  relevant 
mechanisms behind the two properties and explore what types of systems might exhibit
these  properties.  We  will  then  investigate  how  to  take  advantage  of  these  two  properties 
[Figure 1.1; x-axis: $p - 3.9$ (deviation in parameter value from $p_0$)]

Figure 1.1: Graph of best shadowing distance, $e(N,p)$, with respect to $p$, for $f_p = px(1 - x)$,
$p_0 = 3.9$ and $x_0 = 0.3$. $e(N,p)$ measures how closely an orbit of $f_p$ can shadow a fixed orbit,
$\{x_n\}_{n=0}^{N}$, of $f_{p_0}$. On the x-axis of the graph, $p$ is labeled as $p - p_0$, the difference in parameter
value from the parameter used to generate $\{x_n\}_{n=0}^{N}$. $e(N,p)$ is plotted on the y-axis with $N$
held constant for three different values of $N$. The three v-shaped curves represent $e(N,p)$ for
$N = 61$, $N = 250$, and $N = 1000$. Note the distinct asymmetry in how well orbits of $f_p$ track
$\{x_n\}_{n=0}^{N}$ for $p > p_0$ and $p < p_0$.
[Figure 1.2; x-axis: $N$ (number of states used)]

Figure 1.2: Graph of $a(N)$ and $b(N)$ with respect to $N$ for $\epsilon_0 = 0.01$. $a(N)$ is a measure of
the number of parameter values, $p < p_0$, such that there exists an orbit of $f_p$ that can shadow
the orbit, $\{x_n\}_{n=0}^{N}$, of $f_{p_0}$ with less than $\epsilon_0$ error. Similarly $b(N)$ measures the number of
parameter values, $p > p_0$, such that an orbit of $f_p$ can shadow the orbit, $\{x_n\}_{n=0}^{N}$, with less than $\epsilon_0$
error.




to  produce  superior  parameter  estimation  algorithms. 


1.3  New  results  and  previous  work 

1.3.1  Dynamical  theory  and  shadowing  orbits 

There  has  not  been  much  work  directly  attacking  the  parameter  estimation  problem  for 
chaotic  systems.  However,  as  we  saw  in  the  previous  section,  the  feasibility  of  parameter 
estimation  is  closely  related  to  the  concept  of  shadowing  orbits. 

For uniformly hyperbolic systems, it is well known that orbits of perturbed systems
shadow orbits of the original system forever ([7], [4]). Applying this result to parameter
estimation, we find that one cannot expect to get accurate information about the
parameters of a hyperbolic system based on state data, since it is difficult to distinguish
orbits from systems with two nearby parameter values.

However, most physically interesting chaotic systems are not in fact hyperbolic. In
general,^5 one can only expect so-called subexponentially hyperbolic behavior (see e.g.,
[52]), so that hyperbolicity on a state orbit is available on a local scale, but is not uniform
over an infinite orbit. The result is that most finite pseudo-orbits^6 of a system can be
shadowed closely by a real orbit of that system. This observation was made in [24],
where attempts were also made to establish bounds on the shadowing behavior of finite
orbits in nonhyperbolic systems by using linearization to exploit the locally hyperbolic
behavior along a typical state orbit. Such work received interest because shadowing was
thought of as a helpful property that lends credibility to computer-generated orbits with
roundoff error.

In the case of parameter estimation, the hyperbolic degeneracies that prevent
shadowing behavior are in fact the focus of most of the interest. This is unlike past work
involving shadowing orbits, because in order to investigate the feasibility of parameter
estimation, it is important to specifically examine the mechanism behind the lack of
shadowing behavior in nonhyperbolic systems. In addition, it is also necessary to
examine carefully how orbits for one parameter value can shadow orbits for systems with a
continuum of different parameter values.

The  result  is  that  we  find  that  most  measurements  of  the  state  of  a  system  contain 
comparatively  little  information  about  the  parameters  of  the  system  except  for  those 
iterates  where  the  hyperbolic  behavior  of  a  system  becomes  degenerate.  This  is  the 
phenomenon  we  observed  with  the  quadratic  map. 

^5 For example, for almost all diffeomorphisms.

^6 A pseudo-orbit of a map, $g$, is a sequence of states $\{z_n\}$ such that $z_{n+1} = g(z_n) + v_n$ for all $n$,
where the magnitude of the noise, $|v_n|$, is assumed to be small.
In this report, we discuss how the lack of shadowing behavior seems to be the result
of  a  mechanism  which  shall  be  referred  to  as  folding  in  state  space.  It  also  seems  that  this 
folding  behavior  tends  naturally  to  result  in  one-sided  shadowing  behavior  in  parameter 
space,  making  it  possible  to  effectively  distinguish  parameter  values  near  areas  where 
folding  occurs. 

For one dimensional maps like the quadratic map, we have been able to characterize
the results quantitatively. For example, for the quadratic map, $f_p(x) = px(1 - x)$, we
show that the following is true:

Proposition: Let

$$e(p,p_0,x_0) = \lim_{N \to \infty} e(N,p,p_0,x_0)$$

where $e(N,p,p_0,x_0)$ is as given in (1.2). There exist constants $\delta > 0$, $C > 0$, and $K > 0$
such that the following is true: For any $\gamma \in (0, 1)$, there is a set, $E(\gamma) \subset [0,4]$, of positive
Lebesgue measure such that if $p_0 \in E(\gamma)$, then:

(1) For $x_0 \in [0,4]$,

e(p,po,Xo)  <  C\p-po\^ 

for all $p \in (p_0, p_0 + \delta)$.

(2) For almost all $x_0 \in [0,4]$,

$$e(p,p_0,x_0) > K|p - p_0|^{\gamma}$$

for all $p \in (p_0 - \delta, p_0)$.
This  follows  from  Theorem  3.4.2. 

From  the  proposition  we  see  that  there  can  in  fact  be  a  pronounced  asymmetry  in 
the  shadowing  behavior  of  orbits  in  parameter  space  and  that  this  phenomenon  is  quite 
prevalent.  For  the  quadratic  maps  (1.1)  with  positive  Lyapunov  exponents,  it  can  also 
be  shown  that  the  asymmetry  always  favors  one  particular  direction  in  parameter  space 
for  maps.  That  is,  it  is  always  easier  for  orbits  of  maps  with  slightly  higher  parameters 
to  shadow  orbits  of  maps  with  slightly  lower  parameters. 

For more complicated systems, like systems in higher dimensions, it is more difficult
to establish definite analytical results. However, we present numerical results that
demonstrate  that  surprisingly  many  systems  have  the  properties  discussed,  namely  that 
(1)  a  small  fraction  of  the  data  contains  most  of  the  information  about  the  parameters 
of  the  system,  and  (2)  there  is  an  asymmetry  in  the  behavior  of  shadowing  orbits  in 
parameter  space. 
1.3.2 Parameter estimation techniques

Traditionally, parameter estimation is carried out numerically using algorithms like the
extended Kalman filter. However, we will demonstrate that algorithms like the extended
Kalman filter that linearize state and parameter space around a certain trajectory
actually perform worse than one might expect simply from linearization errors. This is
basically because most of the information about the parameters is contained in a small
number of data points, the very data points where nonlinear folding behavior is most
important. Techniques like the extended Kalman filter have a difficult time modeling the
folding behavior of state space with these data points, along with the local exponential
expansion and contraction properties of state space in chaotic systems. The result is
that these algorithms typically diverge. In other words, the algorithm's estimate of the
error in its parameter estimate quickly becomes much less than the actual error, so that
the algorithm ends up converging to the wrong parameter value.

In  this  report,  we  describe  a  new  algorithm  for  performing  parameter  estimation  on 
chaotic  systems  and  show  numerical  results  demonstrating  the  effectiveness  of  the  new 
algorithm  and  comparing  the  algorithm  with  traditional  techniques.  The  new  algorithm 
attempts  to  sift  through  most  of  the  data  quickly,  concentrating  on  the  measurements 
that  are  most  sensitive  to  parameter  values.  The  algorithm  then  uses  a  technique,  based 
on  a  Monte  Carlo  method,  to  pick  out  a  parameter  estimate  by  taking  advantage  of  the 
fact  that  shadowing  behavior  tends  to  be  asymmetrical  in  parameter  space. 


1.4  Overview 

This  report  is  divided  into  two  major  parts.  The  first  part,  which  includes  chapters 
2-4,  discusses  theoretical  results  concerning  parameter  estimation  in  chaotic  systems.  In 
particular,  we  are  interested  in  questions  like:  (1)  What  possible  constraints  are  there 
to  the  accuracy  of  parameter  estimates,  and  what  kind  of  accuracy  can  one  expect  given 
large  amounts  of  data?  (2)  How  is  the  accuracy  of  a  parameter  estimate  likely  to  depend 
on  the  magnitude  of  the  measurement  error  and  the  number  of  state  measurements 
available?  (3)  What  types  of  systems  exhibit  the  most  sensitivity  to  small  parameter 
changes,  and  what  types  of  systems  are  likely  to  produce  the  most  (and  least)  accurate 
parameter  estimates?  Basically  we  want  to  understand  exactly  how  much  information 
state  samples  actually  contain  about  the  parameters  of  various  types  of  systems. 

In  order  to  answer  these  questions,  we  first  examine  how  parameter  estimation  relates 
to  well-known  concepts  like  shadowing,  hyperbolicity,  and  structural  stability.  Chapter 
2  discusses  how  the  established  theory  concerning  these  concepts  relates  to  the  problem 
of  parameter  estimation.  We  also  examine  what  types  of  systems  are  guaranteed  to  have 
topologically  stable  sorts  of  behavior  and  how  this  constrains  our  ability  to  do  parameter 
estimation. 

In chapter 3, we examine one-dimensional maps. Because of the relative simplicity
of these systems, they are ideal for investigating how the specific geometry of a system
relates to parameter estimation, especially when one is dealing with systems that are not
topologically or structurally stable. New quantitative results are obtained concerning
how orbits for nearby parameter values shadow each other in certain one-dimensional
families of maps.

In  chapter  4  we  examine  non-uniformly  hyperbolic  systems  of  dimension  greater  than 
one.  In  such  general  settings  it  is  difficult  to  make  quantitative  statements  concerning 
limits to parameter estimation. However, we extend ideas from the analysis of
one-dimensional systems to suggest mechanisms that determine the shadowing behavior
of orbits. These mechanisms result from an examination of the stable and unstable
manifolds of the systems. Although the conjectures we make are not rigorously proved,
they  are  supported  by  numerical  evidence. 

The  second  major  part  of  the  report  (comprising  chapter  5)  describes  an  effort  to 
use the dynamical systems theory to develop a reasonable algorithm to numerically
estimate the parameters of a system given noisy state samples. We discuss why traditional
methods  of  parameter  estimation  have  problems,  and  some  ways  to  fix  these  problems. 

In  chapter  6  we  present  numerical  results  demonstrating  the  effectiveness  of  the  new 
estimation  techniques  proposed. 

Chapter  7  summarizes  the  main  conclusions  of  this  report,  and  suggests  possible 
future  work. 

Chapter 2

Parameter  estimation,  shadowing, 
and  structural  stability 


In  this  chapter  we  review  a  variety  of  established  mathematical  results  and  apply  these 
results to an analysis of parameter estimation. In particular, we examine how topological
stability results for certain types of systems constrain the feasibility of parameter
estimation. 


2.1  Preliminaries  and  definitions 

In  this  section,  we  introduce  some  of  the  basic  definitions  and  tools  needed  to  analyze 
problems related to parameter estimation. We begin by restating a mathematical
description of the problem. We are given the family of discrete mappings, $f_p : M \to M$,
where $M$ is a smooth compact manifold and $p$ represents the invariant parameters of
the system. For the purposes of this report, we will also assume that $p$ is a scalar so
that $f_p$ represents a one-parameter family of maps for $p \in I_p$, where $I_p \subset \mathbb{R}$ is a closed
interval of the real line. Note that it will often be convenient to write $f(x,p)$ in place of
$f_p(x)$ to denote functional dependence on both $x$ and $p$. We will assume that this joint
function of state and parameters, $f : M \times I_p \to M$, is continuous over its domain.

The data we are given consists of a sequence, $\{y_n\}$, of noisy observations of the state
vectors, $\{x_n\}$, where $y_n \in M$, $x_n \in M$, and:

$$x_{n+1} = f_p(x_n)$$

$$y_n \in B(x_n, \epsilon)$$

for all $n \in \mathbb{Z}$, where $\epsilon > 0$ and $B(x_n, \epsilon)$ represents an $\epsilon$-neighborhood of $x_n$ (i.e.,
$y_n \in B(x_n, \epsilon)$ if and only if $d(y_n, x_n) < \epsilon$ for some distance metric $d$). In other words, the
measured data, $y_n$, consists of the actual state of the system, $x_n$, plus some noise of
magnitude $\epsilon$ or less.
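
The observation model above can be sketched numerically. The following is a minimal illustration only: the logistic family $f(x,p) = p\,x(1-x)$, the uniform noise distribution, and the names `f` and `simulate` are assumptions made for the example and do not come from this report (the logistic map is also an interval map rather than a diffeomorphism of a compact manifold).

```python
import random

def f(x, p):
    """Hypothetical one-parameter family: logistic map f(x, p) = p * x * (1 - x)."""
    return p * x * (1.0 - x)

def simulate(p, x0, n_steps, eps, seed=0):
    """Return (states, observations) with |y_n - x_n| < eps for every n."""
    rng = random.Random(seed)
    xs, ys = [], []
    x = x0
    for _ in range(n_steps):
        xs.append(x)
        # Bounded noise, strictly inside (-eps, eps); the distribution is an assumption.
        noise = 0.999 * eps * (2.0 * rng.random() - 1.0)
        ys.append(x + noise)
        x = f(x, p)  # x_{n+1} = f_p(x_n)
    return xs, ys

xs, ys = simulate(p=3.9, x0=0.3, n_steps=200, eps=1e-3)
```

Only the bound on the noise magnitude matters for the shadowing arguments that follow; no statistical model of the noise is assumed in the definitions above.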

Note that if we fix $p_0 \in I_p$, we can generate an orbit, $\{x_n\}$, given an initial condition,
$x_0$. Basically, we would like to know how much information this state orbit contains
about the parameters of the system. In other words, within possible measurement error,
can we resolve $\{x_n\}$ from orbits of nearby systems in parameter space? In particular,
are there parameters near $p_0$ that have no orbits that closely follow $\{x_n\}$? If so, then
we know that such parameters could not possibly produce the state data represented by
$\{x_n\}$, and we can thus eliminate these parameters as possible choices for the parameter
estimate. Thus, given $p_0 \in I_p$ and a state orbit, $\{x_n\}$, of $f_{p_0}$, one important question
to ask is: For what values of $p \in I_p$ does there exist an orbit, $\{z_n\}$, of $f_p$ such that
$d(z_n, x_n) < \epsilon$ for all $n$?
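
For a finite stretch of orbit, this question can be probed by brute force. The sketch below is illustrative only: it assumes the logistic family as a stand-in system, uses a finite horizon instead of all $n$, and searches a grid of candidate initial conditions, so a `False` answer over long horizons may simply mean the grid was too coarse. The names `orbit` and `can_shadow` are hypothetical.

```python
def f(x, p):
    """Hypothetical family used for illustration: logistic map."""
    return p * x * (1.0 - x)

def orbit(p, x0, n):
    """First n points of the orbit of f_p starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(f(xs[-1], p))
    return xs

def can_shadow(p, xs, eps, grid=2001):
    """True if some grid point z0 near xs[0] has an f_p orbit with |z_n - x_n| < eps."""
    for i in range(grid):
        # Candidate initial conditions span [x_0 - eps, x_0 + eps]; the middle one is x_0.
        z = xs[0] + eps * (2.0 * i / (grid - 1) - 1.0)
        ok = True
        for x in xs:
            if abs(z - x) >= eps:
                ok = False
                break
            z = f(z, p)
        if ok:
            return True
    return False

xs = orbit(3.9, 0.3, 20)
print(can_shadow(3.9, xs, 1e-2), can_shadow(3.7, xs, 1e-2))
```

Parameters for which no candidate orbit stays within $\epsilon$ are the ones that can be eliminated in the sense described above.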

This  relates  parameter  estimation  to  the  concept  of  shadowing.  Below  we  describe 
some  definitions  for  various  types  of  shadowing  that  will  be  useful  later  on: 

Definitions: Let $g : M \to M$ be continuous. Suppose $d(g(z_n), z_{n+1}) < \delta$ for all $n$.
Then $\{z_n\}$ is said to be a $\delta$-pseudo-orbit of $g$. We say that a sequence of states, $\{x_n\}$,
$\epsilon$-shadows another sequence of states, $\{y_n\}$, if $d(x_n, y_n) < \epsilon$ for all $n$. The map $g$ has
the pseudo-orbit shadowing property if for any $\epsilon > 0$, there is a $\delta > 0$ such that every
$\delta$-pseudo-orbit is $\epsilon$-shadowed by a real orbit of $g$. The family of maps, $\{f_p \mid p \in I_p\}$, is
said to have the parameter shadowing property at $p_0 \in I_p$ if for any $\epsilon > 0$, there exists a
$\delta > 0$ such that every orbit of $f_{p_0}$ is $\epsilon$-shadowed by some orbit of $f_p$ for any $p \in B(p_0, \delta)$.
Finally, suppose that $g \in X$ where $X$ is some metric space. Suppose further that for
any $\epsilon > 0$, there is a neighborhood of $g$, $U \subset X$, such that if $g' \in U$ then any orbit of $g$
is $\epsilon$-shadowed by an orbit of $g'$. Then $g$ is said to have a function shadowing property
in $X$.
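
The first two definitions translate directly into checks on finite sequences. The following sketch is only an illustration: it works on finite lists instead of bi-infinite sequences, defaults to the absolute-value metric on the real line, and uses a hypothetical contraction `half` as the map; none of these choices come from the report.

```python
def is_pseudo_orbit(g, zs, delta, d=lambda a, b: abs(a - b)):
    """True if d(g(z_n), z_{n+1}) < delta for every consecutive pair in zs."""
    return all(d(g(zs[n]), zs[n + 1]) < delta for n in range(len(zs) - 1))

def eps_shadows(xs, ys, eps, d=lambda a, b: abs(a - b)):
    """True if d(x_n, y_n) < eps for every n (sequences must have equal length)."""
    return len(xs) == len(ys) and all(d(x, y) < eps for x, y in zip(xs, ys))

def half(x):
    """Hypothetical example map g(x) = x / 2."""
    return 0.5 * x

true_orbit = [1.0, 0.5, 0.25, 0.125]   # a real orbit of half
noisy = [1.0, 0.51, 0.265, 0.1425]     # each step is off by about 0.01

print(is_pseudo_orbit(half, noisy, 0.02), eps_shadows(true_orbit, noisy, 0.02))
```

A real orbit is a $\delta$-pseudo-orbit for every $\delta > 0$, which is why `true_orbit` passes the first check for arbitrarily small `delta`.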

We can see that the various types of shadowing have natural connections to parameter
estimation. If two orbits $\epsilon$-shadow each other, then these two orbits will (to first order)
be indistinguishable from each other with measurement noise of magnitude $\epsilon$. If $f_{p_0}$ has
the parameter shadowing property, then all systems near $p = p_0$ in parameter space have
orbits that $\epsilon$-shadow orbits of $f_{p_0}$. This implies inherent constraints on the attainable
accuracy of parameter estimation based on state data, since observable state differences
for nearby systems in parameter space are lost in the noise caused by measurement
errors.

Thus  parameter  shadowing  is  really  the  property  we  are  most  interested  in  because 
of  its  direct  relationship  with  parameter  estimation.  The  concept  of  function  shadowing 
is simply a generalization of parameter shadowing so that given some function $g$, we can
guarantee that any continuous parameterization of systems containing $g$ must have the
parameter shadowing property at $g$. This situation implies that the state evolution of
the  system  is  in  some  sense  stable  or  insensitive  to  small  perturbations  in  the  system. 
In  the  literature,  the  following  language  is  used  to  describe  this  sort  of  stability. 

Definitions: Two continuous maps, $f : M \to M$ and $g : M \to M$, are said to be
topologically conjugate if there exists a homeomorphism, $h$, such that $gh = hf$. Let $\mathrm{Diff}^r(M)$
be the space of $C^r$ diffeomorphisms of $M$. Then $g \in \mathrm{Diff}^r(M)$ is said to be structurally
stable if for every neighborhood, $U$, of the identity function in the space of homeomorphisms
of $M$, there is a neighborhood, $V \subset \mathrm{Diff}^r(M)$, of $g$ such that for each $f \in V$ there exists a
homeomorphism, $h_f \in U$, satisfying $f = h_f^{-1} g h_f$. In addition, if there exists a constant $K > 0$ and
neighborhood $V' \subset V$ of $g$ such that $\sup_{x \in M} d(h_f(x), x) \le K \sup_{x \in M} d(f(x), g(x))$ for
any $f \in V'$, then $g$ is said to be absolutely structurally stable.
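
As a concrete instance of the conjugacy relation $gh = hf$, the sketch below checks numerically a standard textbook example that is not taken from this report: the tent map $f$ and the logistic map $g(x) = 4x(1-x)$ on $[0,1]$ are conjugate through $h(x) = \sin^2(\pi x / 2)$. These are interval maps rather than diffeomorphisms, so the example illustrates only the conjugacy equation, not structural stability.

```python
import math

def tent(x):
    """f: the tent map on [0, 1]."""
    return 2.0 * x if x < 0.5 else 2.0 - 2.0 * x

def logistic4(x):
    """g: the logistic map with parameter 4 on [0, 1]."""
    return 4.0 * x * (1.0 - x)

def h(x):
    """Conjugating homeomorphism of [0, 1]: h(x) = sin^2(pi * x / 2)."""
    return math.sin(0.5 * math.pi * x) ** 2

# Largest violation of g(h(x)) = h(f(x)) over a grid of sample points.
max_err = max(abs(logistic4(h(i / 1000.0)) - h(tent(i / 1000.0))) for i in range(1001))
print(max_err)
```

Because $h$ maps orbits of $f$ to orbits of $g$ one-to-one, any statement about which orbits exist for one map transfers to the other.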

Unfortunately,  we  have  introduced  a  rather  large  number  of  definitions.  Some  of  the 
definitions  apply  directly  to  parameter  estimation,  and  others  are  introduced  because 
they  are  historically  important  and  are  necessary  in  order  to  apply  results  found  in  the 
literature.  Before  we  continue,  it  is  important  to  state  clearly  how  the  various  properties 
are  related  and  exactly  what  they  mean  for  parameter  estimation. 


2.2  Shadowing  and  structural  stability 

We  now  investigate  the  relationship  between  various  shadowing  properties  and  structural 
stability.  The  goal  here  is  to  relate  well-known  concepts  like  pseudo-orbit  shadowing  and 
structural  stability  to  parameter  and  function  shadowing,  so  that  we  can  apply  results 
from  the  literature. 

Let us begin with a brief discussion. First of all, given any $p_0 \in I_p$, note that if $p$ is
near $p_0$, then orbits of $f_p$ are pseudo-orbits of $f_{p_0}$. The pseudo-orbit shadowing property
implies  that  a  particular  system  can  shadow  all  trajectories  of  nearby  systems.  That 
is,  any  orbit  of  a  nearby  system  can  be  shadowed  by  an  orbit  of  the  given  system.  On 
the  other  hand,  function  shadowing  is  somewhat  the  opposite.  A  system  exhibits  the 
function  shadowing  property  if  all  nearby  systems  can  shadow  it.  Meanwhile,  structural 
stability  implies  a  one-to-one  correspondence  between  orbits  of  all  systems  within  a  given 
neighborhood  in  function  space.  Thus,  if  a  system  is  structurally  stable,  then  all  nearby 
systems  can  shadow  each  other. 
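
The observation that orbits of $f_p$ are pseudo-orbits of $f_{p_0}$ can be made quantitative for a specific family. As an illustration only (the logistic family is an assumption, not a system singled out here), $|f_p(x) - f_{p_0}(x)| = |p - p_0|\,x(1-x) \le |p - p_0|/4$ on $[0,1]$, so every orbit of $f_p$ is a $\delta$-pseudo-orbit of $f_{p_0}$ for any $\delta > |p - p_0|/4$. The sketch below measures the defect along one orbit.

```python
def f(x, p):
    """Hypothetical family used for illustration: logistic map."""
    return p * x * (1.0 - x)

def pseudo_orbit_defect(p, p0, x0, n):
    """max over n steps of |f_{p0}(x_k) - x_{k+1}| along an orbit {x_k} of f_p."""
    x, worst = x0, 0.0
    for _ in range(n):
        x_next = f(x, p)
        worst = max(worst, abs(f(x, p0) - x_next))
        x = x_next
    return worst

p0 = 3.9
p = p0 + 1e-4
defect = pseudo_orbit_defect(p, p0, 0.3, 1000)
delta = abs(p - p0) / 4.0  # analytic bound for this particular family
print(defect, delta)
```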

While these three properties are not equivalent in general, they are apparently
equivalent for certain types of expansive maps, where the definition of expansiveness is given
below:

Definitions: A homeomorphism $g : M \to M$ is said to be expansive if there exists
$e(g) > 0$ such that

$$d(g^n(x), g^n(y)) < e(g)$$

for all $n \in \mathbb{Z}$ if and only if $x = y$.^1 $e(g)$ is called the expansive constant for $g$. Also, suppose
$X$ is a metric space of homeomorphisms. Then a function $g \in X$ is uniformly expansive
in $X$ if there exists a neighborhood $V \subset X$ of $g$ such that $\inf_{f \in V} e(f) > 0$.

^1 Note that in general, if $g$ is a function then we will write $g^n$ to mean the function $g$ composed with
itself $n$ times. We assume that $g^0$ is the identity function.

We now state some properties relating pseudo-orbit shadowing, function shadowing,
and structural stability. Many of these results are addressed by Walters in [62]. We refer
the reader to [62] and fill in the gaps as necessary in Appendix A.

Theorem 2.2.1 Let $g : M \to M$ be a structurally stable diffeomorphism. Then $g$ has
the function shadowing property.

Proof: This follows directly from the definitions of structural stability and function
shadowing. The conjugating homeomorphism, $h$, from the definition of structural stability
provides a one-to-one connection between shadowing orbits of nearby maps.

Theorem 2.2.2 (Walters) Let $g : M \to M$ be a structurally stable diffeomorphism of
dimension $\ge 2$. Then $g$ has the pseudo-orbit shadowing property.

Proof: This follows directly from Theorem 11 of [62]. The proof is not as simple as that of the
previous theorem, since a pseudo-orbit of $g$ is not necessarily a real orbit of a nearby
map. However, Walters shows that given a pseudo-orbit of $g$, we can pick a (possibly)
different pseudo-orbit of $g$ that both shadows the original pseudo-orbit and is in fact a
true orbit of a nearby map. Then structural stability can be invoked to show that
there must be a real orbit of $g$ that shadows the original pseudo-orbit.

Theorem 2.2.3 Let $g : M \to M$ be an expansive diffeomorphism with the pseudo-orbit
shadowing property. Suppose there exists a neighborhood, $V \subset \mathrm{Diff}^r(M)$, of $g$ that is
uniformly expansive. Then $g$ is structurally stable.

Proof: This follows from discussions in [62]. See Appendix A.

Theorem 2.2.4 Let $g : M \to M$ be an expansive diffeomorphism with the function
shadowing property. Suppose there exists a neighborhood, $V \subset \mathrm{Diff}^r(M)$, of $g$ such that
$V$ is uniformly expansive. Then $g$ is structurally stable.

Proof: This is similar to theorem 4 of [62]. See Appendix A.

Summarizing our results relating various forms of shadowing and structural stability,
we find that structural stability is the strongest condition considered. Structural
stability of a diffeomorphism of greater than one dimension implies both the pseudo-orbit
shadowing and parameter shadowing properties for continuous families of mappings.
Thus we can use the literature on structural stability to show that certain families of
maps must have parameter shadowing properties, making it difficult to accurately
estimate parameters given state data. As we shall see, however, most systems we are likely
to encounter in physical applications are not structurally stable.

Also, the pseudo-orbit shadowing property, parameter shadowing property, and structural
stability are equivalent for expansive diffeomorphisms $g : M \to M$ of dimension
greater than one if there exists a neighborhood of $g$ in $\mathrm{Diff}^r(M)$ that is uniformly
expansive. However, again we shall see that most physical systems do not have this
expansiveness property. Note also that these results do not apply to the maps of the
interval which we consider in the next chapter.


2.3 Absolute structural stability and parameter estimation

There  is  one  more  useful  property  we  have  not  yet  addressed.  That  is  the  concept  of 
absolute  structural  stability. 

Lemma 2.3.1 Suppose that $f_p \in \mathrm{Diff}^r(M)$ for $p \in I_p \subset \mathbb{R}$, and let $f(x,p) = f_p(x)$
for any $x \in M$. Suppose that $f : M \times I_p \to M$ is $C^r$ and that $f_{p_0}$ is an absolutely
structurally stable diffeomorphism for some $p_0 \in I_p$. Then there exist $\epsilon_0 > 0$ and $K > 0$
such that for every positive $\epsilon < \epsilon_0$, any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$
if $p \in B(p_0, K\epsilon)$.

Proof:  This  follows  fairly  directly  from  the  definition  of  absolute  structural  stability. 
The  conjugating  homeomorphism  provides  the  connection  between  shadowing  orbits. 
See  Appendix  A  for  a  complete  explanation. 
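
The role of the conjugating homeomorphism can be written out in a few lines. This is only a sketch of the idea, not the argument of Appendix A: it assumes an orbit $\{x_n\}$ of $f_{p_0}$, writes $h_p$ for the conjugacy with $f_p = h_p^{-1} f_{p_0} h_p$, writes $K_{\mathrm{abs}}$ for the constant in the definition of absolute structural stability, and adds the assumption (a symbol $L$ not used elsewhere in this report) that $f$ is Lipschitz in $p$ uniformly in $x$ with constant $L$.

```latex
% Sketch only; K_abs and L are labels introduced for this illustration.
\begin{align*}
  f_p = h_p^{-1} f_{p_0} h_p
    &\;\Longrightarrow\;
    f_p\bigl(h_p^{-1}(x_n)\bigr) = h_p^{-1}\bigl(f_{p_0}(x_n)\bigr) = h_p^{-1}(x_{n+1}), \\
  d\bigl(h_p^{-1}(x_n),\, x_n\bigr)
    &\le \sup_{x \in M} d\bigl(h_p(x),\, x\bigr)
     \le K_{\mathrm{abs}} \sup_{x \in M} d\bigl(f_p(x),\, f_{p_0}(x)\bigr)
     \le K_{\mathrm{abs}}\, L\, |p - p_0| .
\end{align*}
```

The first line says that $\{h_p^{-1}(x_n)\}$ is an orbit of $f_p$; the second says it stays within $K_{\mathrm{abs}} L |p - p_0|$ of $\{x_n\}$. Under these assumptions the shadowing distance is below $\epsilon$ whenever $|p - p_0| < \epsilon / (K_{\mathrm{abs}} L)$, which has the form $p \in B(p_0, K\epsilon)$ with $K = 1/(K_{\mathrm{abs}} L)$, provided $\epsilon$ is small enough that $f_p$ lies in the neighborhood where the bound holds.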

Thus  if  an  absolutely  structurally  stable  mapping,  g,  is  a  member  of  a  continuous 
parameterization of mappings, then nearby maps in parameter space can $\epsilon$-shadow any
orbit of $g$. Furthermore, from above we see that the range of parameters that can shadow
orbits of $g$ varies at most linearly with $\epsilon$ for sufficiently small $\epsilon$ so that decreasing the
measurement  error  will  not  result  in  any  dramatic  improvements  in  estimation  accuracy. 
In  these  systems,  it  is  clear  that  dynamics  does  not  contribute  a  great  deal  to  our  ability 
to  distinguish  between  the  behavior  of  nearby  systems.  In  the  next  section,  we  shall  see 
that  so-called  uniformly  hyperbolic  systems  can  exhibit  this  absolute  structural  stability 
property,  making  them  poor  systems  for  accurate  parameter  estimation. 

2.4 Uniformly hyperbolic systems

Let us now turn our attention to identifying what types of systems exhibit the
various shadowing and structural stability properties described in the previous section.
Stability is intimately associated with hyperbolicity, so we begin by examining uniformly
hyperbolic systems.

Uniformly hyperbolic systems are interesting as the archetypes for complex behavior
in nonlinear systems. Because of the definite structure available in such systems, it is
generally easier to prove results in this case than for more general situations. Unfortunately,
from a practical viewpoint, very few physical systems actually exhibit the properties of
uniform hyperbolicity. Nevertheless, understanding hyperbolicity is important as a first
step to figuring out what is happening in more general situations.

Our goal in this section is to state some stability results for hyperbolic systems and
to motivate the connections between hyperbolicity, stability, and parameter estimation.
Most of the results in this section are well-known and have been written about in numerous
sources. The material provided here outlines some of the properties of hyperbolic
systems that pertain to our treatment of parameter estimation. The brief discussions
use informal arguments in an attempt to motivate ideas rather than provide proofs.
References to more rigorous proofs are given. For an overview of some of the material in
this section, a few good sources include: Shub [55], Nitecki [43], Palis and de Melo [50],
or Newhouse [42].

We  first  need  to  know  what  it  means  to  be  hyperbolic: 

Definitions:

(1) Given $g : M \to M$, $\Lambda$ is a (uniformly) hyperbolic set of $g$ if there exists a continuous
invariant splitting of the tangent bundle, $T_x M = E^s_x \oplus E^u_x$ for all $x \in \Lambda$, and
constants $C > 0$ and $\lambda > 1$ such that:

(a) $|Dg^n v| \le C \lambda^{-n} |v|$ if $v \in E^s_x$, $n \ge 0$
(b) $|Dg^{-n} v| \le C \lambda^{-n} |v|$ if $v \in E^u_x$, $n \ge 0$

(2) A diffeomorphism $g : M \to M$ is said to be Anosov if $M$ is uniformly hyperbolic.
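
A standard example of an Anosov diffeomorphism, not discussed in this report and used here purely as an illustration, is the cat map $(x, y) \mapsto (2x + y,\; x + y) \bmod 1$ on the torus. Its derivative is the same constant matrix at every point, so conditions (a) and (b) can be checked directly on the eigenvectors, with $C = 1$ and $\lambda = (3 + \sqrt{5})/2$. The helper names below are hypothetical.

```python
import math

A = ((2.0, 1.0), (1.0, 1.0))        # Dg for the cat map, the same at every point
A_INV = ((1.0, -1.0), (-1.0, 2.0))  # D(g^{-1}); det(A) = 1

def apply(m, v):
    """Multiply the 2x2 matrix m by the vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def growth(m, v, n):
    """Return |m^n v| / |v|."""
    w = v
    for _ in range(n):
        w = apply(m, w)
    return math.hypot(*w) / math.hypot(*v)

lam = (3.0 + math.sqrt(5.0)) / 2.0  # expanding eigenvalue of A, lam > 1
v_u = (1.0, lam - 2.0)              # spans E^u (eigenvalue lam)
v_s = (1.0, 1.0 / lam - 2.0)        # spans E^s (eigenvalue 1 / lam)

n = 10
print(growth(A, v_s, n) * lam ** n)      # (a): |Dg^n v| = lam^-n |v| on E^s
print(growth(A_INV, v_u, n) * lam ** n)  # (b): |Dg^-n v| = lam^-n |v| on E^u
```

Both printed ratios should be close to 1. Rounding error grows with `n` because any small component along the other eigendirection is amplified, so `n` is kept small.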


One important property for understanding the behavior of hyperbolic systems is
the existence of smooth uniformly contracting and expanding manifolds.

Definition: We define the local stable, $W^s_\epsilon(x,g)$, and unstable, $W^u_\epsilon(x,g)$, sets of
$g : M \to M$ as follows:

$$W^s_\epsilon(x,g) = \{ y \in M : d(g^n(x), g^n(y)) < \epsilon \text{ for all } n \ge 0 \}$$

$$W^u_\epsilon(x,g) = \{ y \in M : d(g^{-n}(x), g^{-n}(y)) < \epsilon \text{ for all } n \ge 0 \}$$

We define the global stable, $W^s(x,g)$, and unstable, $W^u(x,g)$, sets of $g : M \to M$ as
follows:

$$W^s(x,g) = \{ y \in M : d(g^n(x), g^n(y)) \to 0 \text{ as } n \to \infty \}$$

$$W^u(x,g) = \{ y \in M : d(g^{-n}(x), g^{-n}(y)) \to 0 \text{ as } n \to \infty \}.$$

The following result shows that these sets have definite structure. Based on this
result, we replace the word “set” with the word “manifold” in the definitions above, so,
for example, $W^s(x,g)$ and $W^u(x,g)$ are the stable and unstable manifolds of $g$ at $x$.

Theorem 2.4.1 (Stable/unstable manifold theorem for hyperbolic sets): Let $g : M \to M$
be a $C^r$ diffeomorphism ($r \ge 1$), and let $\Lambda \subset M$ be a compact invariant hyperbolic
set under $g$. Then for sufficiently small $\epsilon > 0$ the following properties hold for $x \in \Lambda$:

(1) $W^s_\epsilon(x,g)$ and $W^u_\epsilon(x,g)$ are local $C^r$ disks for any $x \in \Lambda$. $W^s_\epsilon(x,g)$ is tangent to
$E^s_x$ at $x$ and $W^u_\epsilon(x,g)$ is tangent to $E^u_x$ at $x$.

(2) There exist constants $C > 0$ and $\lambda > 1$ such that:

$$d(g^n(x), g^n(y)) \le C \lambda^{-n} \text{ for all } n \ge 0 \text{ if } y \in W^s_\epsilon(x)$$

$$d(g^{-n}(x), g^{-n}(y)) \le C \lambda^{-n} \text{ for all } n \ge 0 \text{ if } y \in W^u_\epsilon(x).$$

(3) $W^s_\epsilon(x)$ and $W^u_\epsilon(x)$ vary continuously with $x$.

(4) We can choose an adapted metric such that $C = 1$ in (2).

Proof:  See  Nitecki  [43]  or  Shub  [55]. 

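Note that from (2) above, we can see that our definitions for the global stable
and unstable manifolds are natural extensions of the local manifolds. In particular,
$W^s_\epsilon(x,g) \subset W^s(x,g)$, $W^u_\epsilon(x,g) \subset W^u(x,g)$, and:

$$W^s(x,g) = \bigcup_{n \ge 0} g^{-n}\bigl(W^s_\epsilon(g^n(x))\bigr)$$

$$W^u(x,g) = \bigcup_{n \ge 0} g^{n}\bigl(W^u_\epsilon(g^{-n}(x))\bigr)$$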

Thus $C^r$ stable and unstable manifolds vary continuously, and intersect transversally on
hyperbolic sets, meaning that the angle of intersection between the stable and unstable
manifolds is bounded away from zero on $\Lambda$. These manifolds create a foliation of
uniformly contracting and expanding sets that provides for a definite structure of the
space. We will now argue that uniformly hyperbolic systems obey shadowing properties
and are structurally stable.

Lemma 2.4.1 (Shadowing Lemma): Let $g : M \to M$ be a $C^r$ diffeomorphism ($r \ge 1$),
and let $\Lambda \subset M$ be a compact invariant hyperbolic set under $g$. Then there exists a
neighborhood, $U \subset M$, of $\Lambda$ such that $g$ has the pseudo-orbit shadowing property on $U$.
That is, given $\epsilon > 0$, there exists $\delta > 0$ such that if $\{z_n\}$ is a $\delta$-pseudo-orbit of $g$, with
$z_n \in U$ for all $n$, then $\{z_n\}$ is $\epsilon$-shadowed by a real orbit, $\{x_n\}$, of $g$ such that $x_n \in \Lambda$
for all integer $n$.

Proof: Proofs for this result can be found in [7] and [55]. Here we sketch an informal
argument similar to the one given by Conley [16] and Ornstein and Weiss [47] for the
case where $g$ is Anosov (i.e., $\Lambda = M$ is hyperbolic).

Let $\{z_n\}$ be a $\delta$-pseudo-orbit of $g$ and let $B_n = B(z_n, \epsilon)$. For the pseudo-orbit shadowing
property to be true, there must be a real orbit, $\{x_n\}$, of $g$ such that $x_n \in B_n$ for
all integer $n$. Thus it is sufficient to show that for any $\epsilon > 0$ there is a $\delta > 0$ such that
given any $\delta$-pseudo-orbit of $g$, there exists $x_0 \in \Lambda$ satisfying

$$x_0 \in \bigcap_{n \in \mathbb{Z}} g^{-n}(B(z_n, \epsilon)). \qquad (2.1)$$


Since the stable and unstable manifolds intersect transversally (at angles uniformly
bounded away from zero), for any $p \in \Lambda$, we can use the structure of the manifolds around
$p$ to define a local coordinate system for uniformly large neighborhoods of $p \in \Lambda$. We
can think of this as locally mapping the stable and unstable manifolds onto a patch of
Euclidean space such that stable and unstable manifolds lie parallel to the axes of a Cartesian grid (see
figure 2.1). Also we can choose an adapted metric on $\Lambda$ (specified in part (4) of the stable
manifold theorem), for each $p \in \Lambda$ so that $g$ has uniform local contraction/expansion
rates. Using this metric on the transformed coordinates, we have a nice, neat model of
local dynamical behavior, as we shall see below. From now on we deal exclusively with
transformed local coordinates centered around $z_n$ and the adapted metric. Note that the
discussion below and the pictures reflect the two-dimensional case (the idea is similar in
higher dimensions).

Now for all $n$ pick squares, $S(z_n, \epsilon) = S_n$, of uniformly bounded size centered at
$z_n$ with $S(z_n, \epsilon) \subset B(z_n, \epsilon)$ such that the sides of $S_n$ are parallel to the axes of the
transformed coordinate system around $z_n$. The sides of the $S_n$ squares are fibered by
stable and unstable manifolds, so when we apply $g$ to $S_n$, the square is stretched into
a rectangle, expanding along the unstable direction, contracting in the stable direction.
Meanwhile, the opposite is true for $g^{-1}$. Note that if we can show that there exists some
$x_0 \in \Lambda$ and $\epsilon > 0$ such that:

$$x_0 \in \bigcap_{n \in \mathbb{Z}} g^{-n}(S(z_n, \epsilon))$$

for any sequence, $\{z_n\}$, that is a $\delta$-pseudo-orbit of $g$, then the shadowing property must
be true. This is our goal.

^2 The local coordinates we refer to here are known as canonical coordinates. For a more rigorous
explanation of these coordinates refer to Smale [59] or Nitecki [43].

Figure 2.1 (panels: Original System; Adapted Metric): First we use the structure of the manifolds of the hyperbolic system to define a
local coordinate system with nice geometric properties, so that the manifolds are orthogonal
and expand/contract uniformly under a single application of $g$.

Figure 2.2: For any $\epsilon > 0$ we can choose $\delta > 0$ so that for any $n \in \mathbb{Z}$, (a) any line segment,
$a^u_n$, along the unstable direction in $S_n$ gets mapped by $g$ so that it intersects $S_{n+1}$, and (b)
any line segment, $a^s_n$, along the stable direction in $S_n$ gets mapped by $g^{-1}$ so that it intersects
$S_{n-1}$.


Let $n \in \mathbb{Z}$ and let $a^u_n$ be any line segment extending the length of a side of $S(z_n, \epsilon)$
parallel to the unstable direction inside $S(z_n, \epsilon)$. Set $a^u_{n+1} = g(a^u_n) \cap S(z_{n+1}, \epsilon)$. Then,
for any $\epsilon > 0$, we can choose a suitably small $\delta_1 > 0$, such that for any $n$, $a^u_{n+1}$ must be
nonempty if $\{z_n\}$ is a $\delta_1$-pseudo-orbit of $g$ (see figure 2.2). In figure 2.2 we see that
$\delta_1 > 0$ represents the possible offset between the centers of the rectangle, $g(S_n)$, and
the square, $S_{n+1}$. As $\epsilon$ gets smaller, the size of the rectangle and square gets smaller, but
we can still choose a suitably small $\delta_1 > 0$ so that $g(a^u_n)$ intersects $S_{n+1}$. Furthermore
we can do exactly the same thing in the opposite direction. That is, let $a^s_n$ be any line
segment extending along the stable direction of $S(z_n, \epsilon)$, set $a^s_{n-1} = g^{-1}(a^s_n) \cap S(z_{n-1}, \epsilon)$,
and choose $\delta_2 > 0$ suitably small so that $a^s_{n-1}$ must be nonempty for any $n$ if $\{z_n\}$ is a
$\delta_2$-pseudo-orbit of $g$.

Given any $\epsilon > 0$ set $\delta = \min\{\delta_1, \delta_2\}$. Then, for any $n > 0$, let $a^s_n(n)$ be a segment
in $S_n = S(z_n, \epsilon)$ parallel to the stable direction. Set $a^s_{k-1}(n) = g^{-1}(a^s_k(n)) \cap S_{k-1}$ for
any $k \le n$. From our previous arguments we know that as long as $\{z_n\}$ is a $\delta$-pseudo-orbit
of $g$, then $a^s_{k-1}(n)$ must be a (nonempty) line in the stable direction within $S_{k-1}$
if $a^s_k(n)$ is a line in the stable direction of $S_k$. Consequently, by induction, $a^s_0(n)$ must
be a line in the stable direction of $S_0$ for any $n > 0$. Furthermore note that $a^s_k(n) \subset S_k$
for any $k \in \{0, 1, \ldots, n\}$. Doing a similar thing for $n < 0$, working with $g$ instead of
$g^{-1}$, and starting with a segment, $a^u_n(n)$, parallel to the unstable direction of $S_n$, we
see that for any $n < 0$ there exists a series of line segments, $a^u_k(n) \subset S_k$, for each
$k \in \{n, n+1, \ldots, -1, 0\}$ oriented in the unstable direction. Clearly $a^u_0(-n)$ and $a^s_0(n)$
must intersect for any $n > 0$. Now consider the limit of this process as $n \to \infty$. It is easy
to show that the intersection point

$$x_0 = \Bigl(\lim_{n \to \infty} a^s_0(n)\Bigr) \cap \Bigl(\lim_{n \to \infty} a^u_0(-n)\Bigr)$$

must exist and must in fact be the $x_0$ we seek satisfying (2.1). This initial condition can
then be used to generate a suitable shadowing orbit, $\{x_n\}$.
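
The pull-back step of this argument has a simple one-dimensional analogue that can be run numerically. The sketch below is an illustration under loudly different assumptions from the lemma: it uses the doubling map $x \mapsto 2x \bmod 1$ on the circle, which is uniformly expanding and not invertible, so there is no stable direction and only the backward construction is needed. Starting from the last point of a finite $\delta$-pseudo-orbit, each earlier point of the shadowing orbit is the preimage closest to the corresponding pseudo-orbit point; since each inverse branch contracts distances by 1/2, the error satisfies $e_k \le (\delta + e_{k+1})/2$ and stays below $\delta$. All function names are hypothetical.

```python
import random

def g(x):
    """Doubling map on the circle [0, 1)."""
    return (2.0 * x) % 1.0

def circle_dist(a, b):
    """Distance between two points on the circle of length 1."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def make_pseudo_orbit(z0, n, delta, seed=0):
    """Finite pseudo-orbit of g whose one-step errors are at most 0.9 * delta."""
    rng = random.Random(seed)
    zs = [z0]
    for _ in range(n - 1):
        jump = 0.9 * delta * (2.0 * rng.random() - 1.0)
        zs.append((g(zs[-1]) + jump) % 1.0)
    return zs

def shadowing_orbit(zs):
    """True orbit of g built backward: x_k is the preimage of x_{k+1} nearest z_k."""
    xs = [0.0] * len(zs)
    xs[-1] = zs[-1]
    for k in range(len(zs) - 2, -1, -1):
        candidates = (xs[k + 1] / 2.0, (xs[k + 1] + 1.0) / 2.0)
        xs[k] = min(candidates, key=lambda c: circle_dist(c, zs[k]))
    return xs

delta = 1e-3
zs = make_pseudo_orbit(0.123, 60, delta)
xs = shadowing_orbit(zs)
print(max(circle_dist(x, z) for x, z in zip(xs, zs)))
```

The orbit is checked step by step rather than by iterating forward from `xs[0]`, because repeated doubling in binary floating point discards one bit of the initial condition per step.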


Theorem  2.4.2  Anosov  diffeomorphisms  are  structurally  stable. 

Proof:  Proofs  for  this  result  can  be  found  in  [4]  and  [37]. 

It is also possible to prove this result based on the shadowing lemma. The basic idea
is to show that any Anosov diffeomorphism, $g : M \to M$, is uniformly expansive, and
then to apply theorem 2.2.3 to get structural stability. Walters does this in [62]. We
outline the arguments.

The fact that $g$ is expansive is not too difficult to show. If this were not true, then
there must exist $x \ne y$ such that $d(g^n(x), g^n(y)) < \epsilon$ for all integer $n$. But satisfying this
condition for both $n \ge 0$ and $n \le 0$ would imply that $y \in W^s_\epsilon(x,g)$ and $y \in W^u_\epsilon(x,g)$,
respectively. This cannot happen unless $x = y$. The contradiction shows that the Anosov
diffeomorphism, $g$, must be expansive with expansive constant, $e(g) \ge \epsilon$, where $\epsilon > 0$ is
as specified in the stable manifold theorem.

The next step is to observe that there exists a neighborhood, $U$, of $g$ in $\mathrm{Diff}^r(M)$
such that any $f \in U$ is Anosov. Then since the stable and unstable manifolds $W^s_\epsilon(x, f)$
and $W^u_\epsilon(x, f)$ vary continuously with respect to $f \in U$ ([28]),^3 we can show that there
exists a neighborhood, $U' \subset U$, of $g$ such that any $f \in U'$ is uniformly expansive. Since $g$
has the pseudo-orbit shadowing property, we can apply theorem 2.2.3 to conclude that
Anosov diffeomorphisms must be structurally stable. This completes our explanation of
theorem 2.4.2.

Theorem 2.4.2, however, is not the most general statement we can make. We need a
few more definitions before we can proceed to the final result in theorem 2.4.3.

Definitions:

(1) A point $x$ is nonwandering if for every neighborhood, $U$, of $x$, there exists arbitrarily
large $n$ such that $f^n(U) \cap U$ is nonempty.

(2) A diffeomorphism $f : M \to M$ satisfies Axiom A if:

(a) the nonwandering set, $\Omega(f) \subset M$, is hyperbolic.

(b) the periodic points of $f$ are dense in $\Omega(f)$.

(3) We say that $f$ satisfies the strong transversality property if for every $x \in M$,
$E^s_x + E^u_x = T_x M$.

Theorem 2.4.3 (Franks) If $f : M \to M$ is $C^2$ then $f$ is absolutely structurally stable
if and only if $f$ satisfies Axiom A and the strong transversality property.

Proof:  See  Franks  [21]. 

Intuitively,  this  result  seems  to  be  similar  to  our  discussion  of  Anosov  systems,  except 
that  hyperbolicity  is  not  available  everywhere.  However,  there  has  been  a  great  deal  of 
research into questions concerning structural stability, especially whether structurally
stable f ∈ Diff^1(M) implies that f satisfies Axiom A and the strong transversality
property. The reader may refer to [55] for discussions and references to this work.

^Instead of hiding the details in this statement about stable and unstable manifolds, [62] gives a
more direct argument (but one that requires math background which I have tried to avoid in the text).
Let B(M, M) be the Banach manifold of all maps from M to M and let Φ_f : B(M, M) → B(M, M) so
that Φ_f(h) = f h g^{-1}. If f = g, Φ_g(h) has a hyperbolic fixed point near the identity function, id (where
by hyperbolic we mean that the spectrum of the tangent map, T_h Φ, is disjoint from the unit circle).
Thus for any f ∈ U, Φ_f(h) has a hyperbolic fixed point near id, and, since g is expansive, this shows
uniform expansiveness for f ∈ U.

For our purposes, however, we now summarize the implications of theorem 2.4.3 for
parameter estimation:

Corollary  2.4.1  Suppose  that  fp  G  DifF^(M)  for  p  €  /p  C  K.,  and  let  f[x,p)  —  fp{x) 
for  any  x  €  M.  Suppose  also  that  f  :  M  x  Ip  M  is  and  that  for  some  po  €  /p, 
f_{p_0} is an Axiom A diffeomorphism with the strong transversality property. Then there
exist ε_0 > 0 and K > 0 such that for every positive ε < ε_0, any orbit of f_{p_0} can be
ε-shadowed by an orbit of f_p if p ∈ B(p_0, Kε).

In other words, Axiom A diffeomorphisms with the strong transversality property satisfy
a function shadowing property. They are stable in such a way that their dynamics do
not magnify differences in parameter values. Chaotic behavior clearly does not lead to
improved parameter estimates in this case. However, as noted earlier, most known physical
systems do not satisfy the rather stringent conditions of uniform hyperbolicity. In
the next two chapters we will investigate results for some systems that are not uniformly
hyperbolic, beginning with the simplest possible case: dynamics in one dimension.

Chapter 3

Maps  of  the  interval 


In  the  last  chapter  we  examined  systems  that  are  uniformly  hyperbolic.  In  this  case, 
orbits  of  nearby  systems  have  the  same  topological  properties  and  shadow  each  other  for 
arbitrarily long periods of time. We now consider what happens for other types
of systems. To start with, we will investigate one-dimensional maps, specifically,
maps  of  the  interval.  One-dimensional  maps  are  useful  because  they  are  the  simplest 
systems  to  analyze;  yet  as  we  shall  see,  even  in  one  dimension  there  is  a  great  variety  of 
possible  behavior,  especially  if  one  is  interested  in  geometric  relationships  between  the 
shadowing  orbits  of  nearby  systems.  Such  relationships  are  important  in  assessing  the 
feasibility  of  parameter  estimation,  since  they  determine  whether  nearby  systems  can  be 
distinguished  from  each  other  in  parameter  space. 

In section 3.1 we begin with a brief overview of which maps of the interval are structurally
stable, and in section 3.2 we look at function shadowing properties of these maps.
Our purpose here is not to classify maps according to various properties. Although it is important
to know what types of systems exhibit various shadowing properties, the main goal
is to distill out some archetypal mechanisms that may be present in a number of interesting
nonlinear systems. Especially of interest are any mechanisms that may help us
understand what occurs in higher dimensional problems.

In the process of investigating function shadowing, we will examine how the “folding”
behavior around turning points (i.e., relative maxima or minima) of one-dimensional
maps governs how orbits shadow each other. This investigation will be extended in section
3.3, where we consider how folding behavior can often lead naturally to asymmetrical
shadowing behavior in the parameter space of maps. This, at least, gives us some hint
as to why we see asymmetrical behavior in a wide variety of numerical experiments. As
we will see in chapter 5, this asymmetrical shadowing behavior seems to be crucial in
developing methods for estimating parameters, so it is important to try to understand
where the behavior comes from.

In order to get definite results, we will restrict our claims to increasingly narrow
classes of mappings. In section 3.4 we will apply our results to a specific example,
namely the one-parameter family of maps we examined in chapter 1:

f_p(x) = p x (1 - x).

Finally,  in  section  3.6,  we  conclude  with  a  number  of  conjectures  and  suggestions  for 
further  research  into  parameter  dependence  in  one-dimensional  maps. 


3.1  Structural  stability 

We  first  want  to  examine  what  types  of  maps  of  the  interval  are  structurally  stable.  These 
are  not  the  types  of  maps  we  are  particularly  interested  in  for  purposes  of  parameter 
estimation,  but  it  is  good  to  identify  which  maps  they  are.  We  briefly  state  some  known 
results,  most  of  which  can  be  found  in  de  Melo  and  van  Strien  [33]. 

Note that since interesting behavior for maps of the interval occurs only in noninvertible
systems, we must slightly revise some of the definitions of the previous chapter in
order to account for this. In particular, instead of bi-infinite orbits, we now deal only
with  forward  orbits.  These  revisions  apply,  for  example,  in  the  definitions  for  various 
types  of  shadowing.  Unless  we  mention  a  new  definition  explicitly,  the  changes  are  as 
one  would  expect. 

Let us, however, make the following new definitions, some of which may be a bit
different from the analogous terms from chapter 2. In the definitions that follow (and in
this chapter in general) assume that I ⊂ R is a compact interval of the real line.

Definitions: Suppose that f : I → I is continuous. Then the turning points of f are
the local extrema of f in the interior of I. C(f) is used to designate the set of all turning
points of f on I. Let C^r(I, I) be the set of continuous maps on I such that f ∈ C^r(I, I)
if the following two conditions hold:

(a) f is C^r (for r ≥ 0)

(b) f(I) ⊂ I.

If in addition, we have that

(c) f(Bd(I)) ⊂ Bd(I) (where Bd(I) denotes the boundary of I),
then  we  say  that  /  ^  {I ,  I). 

For f, g in either of these two spaces, let d(f, g) = sup_{x ∈ I} |f(x) - g(x)|.

1 [33] is the best source of material I have seen for results involving one-dimensional dynamics.

Definitions:

(1) f ∈ C^r(I, I) is said to be structurally stable if there exists a neighborhood U of
f in C^r(I, I) such that for every g ∈ U, there exists a homeomorphism h_g : I → I
such that g h_g = h_g f.

(2) Let f : I → I. The ω-limit set of a point, x ∈ I, is:

ω(x) = {y ∈ I : there exists a subsequence {n_i} such that f^{n_i}(x) → y}.

B is said to be the basin of a hyperbolic periodic attractor if B = {x ∈ I : p ∈ ω(x)}
where p is a periodic point of f with period n and |Df^n(p)| < 1.

(3)  f  E  O'  (/,  I)  is  said  to  satisfy  Axiom  A  if 

(a) f has a finite number of hyperbolic periodic attractors

(b)  Every  x  E  I  is  either  a  member  of  a  (uniformly)  hyperbolic  set  or  is  in  the 
basin  of  a  hyperbolic  periodic  attractor. 
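
As a concrete illustration of a hyperbolic periodic attractor, the following is a minimal sketch in Python. It assumes the quadratic map f_p(x) = p x (1 - x) with the illustrative choice p = 3.2; the function names and the parameter value are assumptions made for this example, not taken from the text. It iterates the turning point until the orbit has settled and then evaluates |Df^n| along the cycle it lands on.

```python
def quadratic(x, p):
    """Quadratic map f_p(x) = p x (1 - x)."""
    return p * x * (1.0 - x)


def quadratic_dx(x, p):
    """Derivative of f_p with respect to the state x."""
    return p * (1.0 - 2.0 * x)


def cycle_multiplier(p, period, n_transient=2000):
    """Estimate |Df^period| on the cycle that attracts the turning point c = 1/2.

    The caller supplies the period; this sketch does not detect it.
    """
    x = 0.5
    for _ in range(n_transient):      # discard the transient
        x = quadratic(x, p)
    multiplier = 1.0
    for _ in range(period):           # chain rule along one period
        multiplier *= quadratic_dx(x, p)
        x = quadratic(x, p)
    return abs(multiplier)
```

For p = 3.2 the attracting cycle has period 2, and a value of cycle_multiplier(3.2, 2) below 1 indicates that the cycle is a hyperbolic periodic attractor in the sense of definition (2).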

The  following  theorem  is  the  one-dimensional  analog  of  theorem  2.4.3. 

Theorem 3.1.1 Suppose that f ∈ C^r(I, I) (r ≥ 2) satisfies Axiom A and the following
conditions:

(1) If c ∈ I and Df(c) = 0, then c ∈ C(f).

(2) f^n(C(f)) ∩ C(f) = ∅ for all n > 0.

Then  f  is  structurally  stable. 

Proof:  See  for  example,  theorem  III.2.5  in  [33]. 

Axiom  A  maps  are  apparently  prevalent  in  one-dimensional  systems.  For  example, 
the  following  is  believed  to  be  true: 

Conjecture 3.1.1 The set of parameters for which f_p(x) = p x (1 - x) satisfies Axiom A
forms a dense set in [0, 4].

Proof:  de  Melo  and  van  Strien  [33]  report  that  Swiatek  has  recently  proved  this  result 
in  [61]. 

Assuming that this result is true, we can paint an interesting picture for the parameter
space of f_p(x) = p x (1 - x). Apparently there is a dense set of parameter values for
which f_p has a hyperbolic periodic attractor. The set of parameter values
satisfying this property must consist of a union of open sets, since we know that these
systems are structurally stable.

On the other hand, this does not mean that all or almost all of the parameter space
of f_p(x) = p x (1 - x) is taken up by structurally stable systems. In fact, as we shall see in
section 3.4, a positive measure of the parameter space is actually taken up by systems
that are not structurally stable. These are the parameter values that we will be most
interested in.


3.2  Function  shadowing 

We now consider function and parameter shadowing. In section 2.2 we saw that for
uniformly expansive diffeomorphisms, structural stability and function shadowing are
equivalent. For more general systems, structural stability still implies function shadowing;
however, the converse is not necessarily true. As we shall see, there are many cases
where the connections between shadowing orbits of nearby systems cannot be described
by a simple homeomorphism. The structure of these connections can in fact be quite
complicated.


3.2.1 A function shadowing theorem

There have been several recent results concerning shadowing properties of one-dimensional
maps. Among these are papers by Coven, Kan, and Yorke [17], Nusse and Yorke [39],
and Chen [12]. This section extends the shadowing results of these papers in order to
examine the possibility of parameter and function shadowing for parameterized families
of maps of the interval.

Specifically, we will deal with two types of maps: piecewise monotone mappings and
uniformly piecewise-linear mappings of a compact interval, I ⊂ R, onto itself:

Definitions: A continuous map f : I → I is said to be piecewise monotone if f has
finitely many turning points. f is said to be a uniformly piecewise-linear mapping if it
can be written in the form:

f(x) = α_i ± s x for x ∈ [c_{i-1}, c_i]

where s > 1, c_0 < c_1 < ... < c_q, and q > 0 is an integer. (We assume s > 1 because
otherwise there will not be any interesting behavior).

Note that for this section, it is useful to define neighborhoods, B(x, ε), so that they
do not extend beyond the confines of I. In other words, let B(x, ε) = (x - ε, x + ε) ∩ I.
With this in mind, we use the following definitions to describe some relevant properties
of piecewise monotone maps.

Definition: A piecewise monotone map, f : I → I, is said to be transitive if for any
two open sets U, V ⊂ I, there exists an n > 0 such that f^n(U) ∩ V ≠ ∅.

Definitions: Let f : I → I be piecewise monotone. Then f satisfies the linking property
if for every c ∈ C(f) and any ε > 0 there is a point z ∈ I and integer n > 0 such that
z ∈ B(c, ε), f^n(z) ∈ C(f), and |f^i(c) - f^i(z)| < ε for every i ∈ {1, 2, ..., n}. Suppose, in
addition, that we can always choose a z ≠ c such that the above condition is satisfied.
Then f is said to satisfy the strong-linking condition.

We  are  now  ready  to  state  the  main  result  of  this  section. 


Theorem 3.2.1: Transitive piecewise monotone maps satisfy the function shadowing
property in C^0(I, I) if and only if they satisfy the strong linking property.


Proof:  The  proof  may  be  found  in  appendix  B. 

In particular, this theorem implies the following parameter shadowing result. Let
I_p ⊂ R be a closed interval of the real line. Suppose that {f_p : I → I | p ∈ I_p} is a
continuously parameterized family of one-dimensional maps, and let f_{p_0} be a transitive
piecewise monotone mapping with the strong linking property. Then f_p must have the
parameter shadowing property at p = p_0. Note that f_{p_0} is certainly not structurally
stable in C^0(I, I).^ The connections between the shadowing orbits are not continuous
and one-to-one in general. In the next section we shall further examine what these
connections are likely to look like.

Now,  however,  we  would  like  to  present  some  motivation  for  why  theorem  3.2.1 
makes  sense.  The  key  to  examining  the  shadowing  properties  of  transitive  piecewise 
monotone  maps  is  to  understand  the  dynamics  near  the  turning  points.  In  regions  away 
from  the  turning  points,  these  maps  look  locally  hyperbolic,  so  finite  pieces  of  orbits 
in  these  regions  shadow  each  other  rather  easily.  The  transitivity  condition  guarantees 
hyperbolicity away from the turning points, since any transitive piecewise monotone
map is topologically conjugate to a uniformly piecewise linear map.

Close to the turning points, however, things are more interesting. Suppose, for
example, that we are given a family of piecewise monotone maps f_p : I → I, and
suppose that we would like to find parameter shadowing orbits for orbits of f_{p_0} that pass
near a turning point, c, of f_{p_0}. Consider a neighborhood, U ⊂ I, around the turning point
c. Regions of state space near c are folded on top of each other by f_{p_0} (see figure 3.1(a)).
This can create problems for parameter shadowing. Consider what the images of U look
like under repeated applications of f_{p_0} compared to what they might look like for two
other parameter values (p_- and p_+) close to p_0 (see figure 3.1(b)). Under the different
parameter values, the forward images of U become offset from each other, since orbits
for parameter values near p_0 look like pseudo-orbits of f_{p_0}.

^In fact, no map is structurally stable in C^0(I, I). This is clear, since any C^0(I, I) neighborhood of
f ∈ C^0(I, I) contains maps with arbitrary numbers of turning points. Since turning points are preserved
by topological conjugacy, f cannot be structurally stable in C^0(I, I).

The forward images of U for different parameter values tend to consistently either
lag or lead each other, a phenomenon which has interesting consequences for parameter
shadowing. For example, in figure 3.1(b), since the forward images of U under f_{p_-} lag
those under f_{p_0}, it appears that f_{p_-}
has a difficult time shadowing the orbit of f_{p_0} emanating from the turning point, c. On
the other hand, from the same figure, there is no reason to expect that there are any
orbits of f_{p_0} which are not shadowed by suitable orbits of f_{p_+}.

However, this is not the end of the story. If the linking condition is satisfied, then
the turning points are recurrent and neighborhoods of turning points keep returning to
turning points to get refolded on top of themselves. This allows the orbits of lagging
parameter values to catch up as regions get folded back (see figure 3.1(c)). In this case,
we see that the forward image of U under f_{p_0} gets folded back into the corresponding
forward image of U under f_{p_-}, thus allowing orbits of f_{p_-} to effectively shadow orbits
of f_{p_0}.

On the other hand, we see that there is an asymmetry in the shadowing behavior of
parameter values depending on whether the folded regions around a turning point lag or
lead one another under the action of different parameter values. The parameter values
that  lag  seem  to  have  a  more  difficult  time  shadowing  other  orbits  than  the  ones  that  lead. 
Making  this  statement  more  precise  is  the  subject  of  the  next  section.  Theorem  3.2.1 
merely  states  that  if  the  strong  linking  condition  is  satisfied,  then  regions  near  turning 
points  are  refolded  back  upon  one  another  in  such  a  way  that  the  parameter  shadowing 
property  is  satisfied. 


3.2.2  An  example:  the  tent  map 


In  [12],  Chen  proves  the  following  theorem: 

Theorem 3.2.2 The pseudo-orbit shadowing property and the linking property are equivalent
for transitive piecewise monotone maps.


One interesting thing to note is the difference between function shadowing and
pseudo-orbit shadowing. For instance, what happens when a transitive map exhibits
the linking property but does not satisfy the strong-linking property? We already know
that such maps must exhibit the pseudo-orbit shadowing property but cannot satisfy
the function shadowing property on C^0(I, I). It is worth a brief look at why this occurs.


As an illustrative example, consider the family of tent maps, f_p : [0, 1] → [0, 1],
where:

f_p(x) = p x        if x ≤ 1/2
f_p(x) = p (1 - x)  if x > 1/2

for p ∈ [0, 2]. Pick p_0 ∈ (√2, 2) such that f_{p_0}^5(1/2) = 1/2. It is not difficult to show that such a
p_0 exists. Numerically we find that one such value for p_0 occurs near p_0 ≈ 1.5128763969.
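
The numerical value quoted for p_0 can be reproduced with a simple root search. The following is a minimal sketch in Python, assuming the tent map as defined above. The bracketing interval [1.5, 1.52], the tolerance, and all function names are assumptions chosen for this illustration; the text does not say how the value was obtained.

```python
def tent(x, p):
    """Tent map f_p(x) on [0, 1]."""
    return p * x if x <= 0.5 else p * (1.0 - x)


def fifth_return(p):
    """f_p^5(1/2) - 1/2; zero when the turning point returns to itself after five iterates."""
    x = 0.5
    for _ in range(5):
        x = tent(x, p)
    return x - 0.5


def bisect(g, lo, hi, tol=1e-13):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid     # root lies in the upper half
        else:
            hi = mid                  # root lies in the lower half
    return 0.5 * (lo + hi)


# Bracket chosen by hand (an assumption): fifth_return changes sign on [1.5, 1.52].
p0 = bisect(fifth_return, 1.5, 1.52)
```

Bisection is used here only because it is easy to check; any bracketing root finder would serve equally well.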

Figure 3.1: (a) illustrates how neighborhoods near a turning point get folded. (b)
shows what might happen for three different parameter values, p_- < p_0 < p_+. The images
of neighborhoods near the critical point tend to get offset from each other so that the
neighborhoods for certain parameters (e.g., p_+) may begin to lead while other parameters (e.g., p_-)
lag behind. Lagging parameters have difficulty shadowing leading parameters. (c) shows how
neighborhoods can get refolded on each other as a result of a subsequent encounter with a
turning point, allowing lagging parameters to “catch up,” so that they are able to shadow
parameter values that normally lead.

We can see that f_{p_0} is transitive on the interval I(p_0) = [f_{p_0}^2(c), f_{p_0}(c)], where in this
case,  c=\.  Given  any  interval,  U  C  I{po),  since  po  >  ifc^U  then  l/po(t^)|  >  y/^\U\ 
and  if  c  e  f/  then  \fp,{U)\  >  fit/],  where  \U\  denotes  the  length  of  the  interval  U.  Thus 
either \f^(U)\  >  2\U\  or  =  /(po),  and  for  any  U  C  I{po)  there  exists  a  A:  >  0  such 

that  f  (U)  =  I{po).  Consequently,  /  must  be  transitive  on  /.  Note  that  even  though 
I(p) is not invariant with respect to p, theorem 3.2.1 still applies, since we could easily
rescale the coordinates to eliminate this problem.

Now let p_0 be near 1.5128763969 so that f_{p_0}^5(c) = c = 1/2. We would like to investigate
the shadowing properties of the orbit, {f_{p_0}^k(c)}_{k=0}^∞. Let f(x, p) = f_p(x). Two important
pieces of information are the following:

D_p f^5(c, p_0) ≈ -1.27155        (3.2)

σ_5(c, p_0) = -1                  (3.3)

where we define:

σ_k(c, p) =  1 if c is a relative maximum of f_p^k
σ_k(c, p) = -1 if c is a relative minimum of f_p^k
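
Both quantities can be checked numerically. The following is a minimal sketch in Python, assuming the tent map of this example and the quoted value of p_0. It propagates the parameter derivative along the orbit of c with the chain rule, and flips a sign each time the orbit passes through a decreasing branch. The helper names, and the convention of using the left-branch derivative at x = 1/2, are assumptions made for this illustration.

```python
P0 = 1.5128763969  # approximate parameter value quoted in the text


def tent(x, p):
    """Tent map f_p(x) on [0, 1]."""
    return p * x if x <= 0.5 else p * (1.0 - x)


def tent_dx(x, p):
    """State derivative; the left-branch value is used at x = 1/2 (a convention)."""
    return p if x <= 0.5 else -p


def tent_dp(x, p):
    """Parameter derivative of the tent map."""
    return x if x <= 0.5 else 1.0 - x


def orbit_statistics(p, n, c=0.5):
    """Return (D_p f^n(c, p), sigma_n(c, p)) for the tent map, n >= 1."""
    x = tent(c, p)             # x = f^1(c, p)
    dp = tent_dp(c, p)         # D_p f^1(c, p)
    sigma = 1                  # c = 1/2 is a relative maximum of f_p
    for _ in range(n - 1):
        slope = tent_dx(x, p)
        dp = slope * dp + tent_dp(x, p)        # chain rule in the parameter
        sigma = sigma if slope > 0 else -sigma # a decreasing branch swaps max and min
        x = tent(x, p)
    return dp, sigma
```

Calling orbit_statistics(P0, 5) should return a negative derivative together with sigma = -1, so the two signs agree at the fifth iterate.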

As we shall see in the next section, statistics like (3.2) and (3.3) are important
references in evaluating the shadowing behavior for families of maps. For this example,
let us consider a combined state and parameter space and examine how a small square
in this space around (x, p) = (c, p_0) gets iterated by the map f. We see that because f_{p_0}^5
has a relative minimum at c = 1/2 and because D_p f^5(c, p_0) is negative, parameter values
higher than p_0 tend to lead while parameter values less than p_0 tend to lag behind in the
manner described earlier in this section. Since the turning point of f_{p_0} at c is periodic
with period 5, this type of lead/lag behavior continues for arbitrarily many iterates.

We want to know if nearby maps, f_p, for p near p_0 have orbits that shadow {f_{p_0}^k(c)}_{k=0}^∞.
Consider how the lead/lag behavior affects possible shadowing orbits. Because c = 1/2 is
periodic, it is possible to verify that the quantity, [σ_n(c, p_0^-) D_p f^n(c, p_0^-)], grows exponentially
as n gets large (where p_0^- indicates that we evaluate the derivative for p arbitrarily
close to, but less than p_0). Thus for maps with parameter values p < p_0, all possible
shadowing orbits diverge away from {f_{p_0}^k(c)}_{k=0}^∞ at a rate that depends exponentially on
the number of iterates. Consequently there exists a δ > 0 such that if p ∈ (p_0 - δ, p_0),
then no orbit of f_p ε-shadows {f_{p_0}^k(c)}_{k=0}^∞ for any ε > 0 sufficiently small. On the other
hand, the orbit {f_{p_0}^k(c)}_{k=0}^∞ can be shadowed by f_p for parameter values p ≥ p_0. In fact,
because everything is linear, it is not difficult to show that there must exist a constant
K > 0 such that for any ε > 0, there is an orbit of f_p that ε-shadows {f_{p_0}^k(c)}_{k=0}^∞
if p ∈ [p_0, p_0 + Kε].
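
The asymmetry can be explored numerically. The following is a minimal sketch in Python, not the argument used in the text. It tracks the set of points that can still lie on an ε-shadowing orbit after k steps: because the tent map sends intervals to intervals, this set is an interval, obtained by mapping the previous one forward and intersecting it with the ε-window around the target orbit. The values ε = 0.01 and |p - p_0| = 10^-4 used in the note below, and all function names, are illustrative assumptions. The target is taken to be the exactly periodic sequence built from the first five iterates of c.

```python
P0 = 1.5128763969  # approximate period-5 parameter value quoted in the text


def tent(x, p):
    """Tent map f_p(x) on [0, 1]."""
    return p * x if x <= 0.5 else p * (1.0 - x)


def image_of_interval(lo, hi, p):
    """Image of [lo, hi] under the tent map (again an interval)."""
    left, right = tent(lo, p), tent(hi, p)
    # The maximum p/2 is attained only if the interval contains the turning point.
    top = p / 2.0 if lo <= 0.5 <= hi else max(left, right)
    return min(left, right), top


def first_shadowing_failure(p, eps, n_steps, p0=P0):
    """First step k at which no orbit of f_p stays within eps of the
    period-5 orbit of c = 1/2 under f_p0; None if that never happens by n_steps."""
    period = [0.5]
    for _ in range(4):
        period.append(tent(period[-1], p0))
    lo, hi = 0.5 - eps, 0.5 + eps          # admissible initial conditions
    for k in range(1, n_steps + 1):
        lo, hi = image_of_interval(lo, hi, p)
        target = period[k % 5]
        lo, hi = max(lo, target - eps), min(hi, target + eps)
        if lo > hi:                         # admissible set is empty
            return k
    return None
```

With these illustrative values, first_shadowing_failure(P0 - 1e-4, 1e-2, 200) should report a finite step, while first_shadowing_failure(P0 + 1e-4, 1e-2, 200) should return None. This matches the lagging and leading cases described above.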

In summary, we see that the orbit, {f_{p_0}^k(c)}_{k=0}^∞, cannot be shadowed by parameter
values p < p_0, but can be shadowed for parameter values p ≥ p_0. f_{p_0} satisfies the
linking but not the strong linking property. Thus f_{p_0} satisfies the pseudo-orbit shadowing
property, and any orbit of f_p for p near p_0 can be shadowed by an orbit of f_{p_0}. On the
other hand, f_{p_0} does not satisfy function or parameter shadowing properties, since not all
nearby systems (for example, f_p for p < p_0) have orbits that shadow orbits of f_{p_0}. Also,
note how the lead and lag behavior in parameter space results naturally in asymmetrical
shadowing properties in parameter space. We will look at this more closely in the next
section.

As a final note and preview for the next section, consider briefly how the above example
might generalize to other situations. The tent map example may be considered
exceptional for two primary reasons: (1) the tent map is uniformly hyperbolic everywhere
except at the turning point, and (2) the turning point of f_{p_0} is periodic. We
are generally interested in more generic situations involving parameterized families of
piecewise monotone maps, especially maps with positive Lyapunov exponents. Apparently
a number of likely scenarios also result in lead/lag behavior in parameter space,
producing  asymmetries  in  shadowing  behavior  similar  to  that  observed  in  the  tent  map 
example.  However,  this  behavior  generally  gets  distorted  by  local  geometry.  Also  things 
become  more  complicated  because  of  folding  caused  by  close  returns  to  turning  points. 
In  particular  for  maps  with  positive  Lyapunov  exponents,  shadowing  orbits  for  lagging 
parameter  values  tend  to  diverge  away  at  exponential  rates,  just  like  in  the  tent  map 
example,  but  this  only  occurs  for  a  certain  number  of  iterates  until  a  close  return  or 
linking  with  a  turning  point  occurs.  In  such  cases,  function  shadowing  properties  may 
exist,  but  the  geometry  of  the  shadowing  orbits  still  reflects  the  asymmetrical  lead/lag 
behavior.  This  behavior  certainly  affects  any  attempts  at  parameter  estimation. 


3.3  Asymmetrical  shadowing 

In the previous two sections we were primarily interested in topologically-oriented results
about whether orbits of nearby one-dimensional systems shadow each other or not.
However, topological results really do not provide enough information for us to draw any
strong conclusions about the feasibility of estimation problems. Whether orbits shadow
each other or not, in general we would also like to know the answers to more specific
questions, for example: what is the expected rate of convergence for a parameter estimate,
and how does the level of noise or measurement error affect the possible accuracy
of a parameter estimate?

In  this  section  we  address  a  more  analytical  treatment  of  the  subject  of  shadowing 
and  parameter  dependence  in  one-dimensional  maps.  The  problem  with  this,  of  course, 
is  that  there  is  an  extremely  rich  variety  of  possible  behavior  in  parameterized  families 
of  mappings,  and  it  is  difficult  to  say  anything  concrete  without  limiting  the  statements 
to  relatively  small  classes  of  maps.  Thus  some  compromises  have  to  be  made.  However, 
we  approach  our  investigation  with  some  specific  goals  in  mind.  In  particular  we  are 
interested in definite bounds on how fast the closest shadowing trajectories in nearby
systems diverge from each other and some explanation concerning how the observed
asymmetrical shadowing behavior gets established in the parameter space. We will
concentrate on smooth maps of the interval, especially the quadratic map,
f_p(x) = p x (1 - x).


3.3.1  Lagging  parameters 

In  this  subsection,  we  argue  that  asymmetries  are  likely  to  occur  in  parameter  space.  In 
particular,  given  a  smooth  piecewise  monotone  map  with  a  positive  Lyapunov  exponent, 
shadowing  orbits  for  nearby  lagging  maps  tend  to  diverge  away  from  orbits  of  the  original 
system at an exponential rate before being folded back by close encounters with turning
points.

Preliminaries 

We  will  primarily  restrict  ourselves  to  maps  with  the  following  properties: 

(C0) g : I → I is piecewise monotone.

(Cl)  g  is  C'^  on  I. 

(C2) Let C(g) be the finite set such that c ∈ C(g) if and only if g has a local extremum
at c ∈ I. Then g''(c) ≠ 0 if c ∈ C(g) and g'(x) ≠ 0 for all x ∈ I \ C(g).

We are also interested in maps that have positive Lyapunov exponents. In particular,
we will examine maps satisfying a set of closely related properties known as the Collet-Eckmann
conditions, (CE1) and (CE2). We will say that a map g satisfies (CE1) or
(CE2), if there exist constants K_E > 0 and λ_E > 1 such that for some c ∈ C(g):

(CE1) |Dg^n(g(c))| > K_E λ_E^n
(CE2) |Dg^n(z)| > K_E λ_E^n if g^n(z) = c,

respectively for any n > 0.
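
A case where (CE1) can be checked by hand is the quadratic map with p = 4. The following is a minimal sketch in Python; this parameter value and the function names are assumptions chosen for the illustration, not an example given in the text. There the critical value g(c) = 1 maps to the fixed point 0, so the derivative product along the orbit of g(c) has magnitude exactly 4^n.

```python
def quadratic(x, p):
    """Quadratic map f_p(x) = p x (1 - x)."""
    return p * x * (1.0 - x)


def quadratic_dx(x, p):
    """Derivative of f_p with respect to x."""
    return p * (1.0 - 2.0 * x)


def derivative_along_critical_orbit(p, n, c=0.5):
    """|Dg^n(g(c))| for g = f_p, computed with the chain rule."""
    x = quadratic(c, p)               # start at the critical value g(c)
    product = 1.0
    for _ in range(n):
        product *= quadratic_dx(x, p)
        x = quadratic(x, p)
    return abs(product)
```

Under this assumption the bound in (CE1) holds with λ_E = 4 and any K_E < 1. For other parameter values the same function gives only a finite-n numerical indication, not a proof.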

We also consider one-parameter families of mappings, f_p : I_x → I_x, parameterized by
p ∈ I_p, where I_x ⊂ R and I_p ⊂ R are closed intervals of the real line. Let f(x, p) = f_p(x)
where f : I_x × I_p → I_x. We are primarily interested in one-parameter families of maps
with the following characteristics:

(D0) For each p ∈ I_p, f_p : I_x → I_x satisfies (C0) and (C1). We also require that C(f_p)
remains invariant with respect to p for all p ∈ I_p.


(Dl)  /  :  4  X  4  ^  4  is  for  all  {x,p)  €  4  x  4- 

Note that the following notation will be used to express derivatives of f(x, p) with respect
to x and p.

D_x f(x, p) = ∂f/∂x (x, p)        (3.4)

D_p f(x, p) = ∂f/∂p (x, p)        (3.5)

The Collet-Eckmann conditions specify that derivatives with respect to the state,
x, grow exponentially. Similarly we will also be interested in families of maps where
derivatives with respect to the parameter, p, also grow exponentially. In other words,
we require that there exist constants K_p > 0, λ_p > 1, and N > 0 such that for some
p_0 ∈ I_p, and c ∈ C(f_{p_0}):


(CP1) |D_p f^n(c, p_0)| > K_p λ_p^n


for all n > N. From now on, given a parameterized family of maps, {f_p | p ∈ I_p}, we will
say that f_{p_0} satisfies (CP1) if the above condition holds.

This may seem to be a rather strong constraint, but in practice it often follows
whenever (CE1) holds. We can see this by expanding with the chain rule:

D_p f^n(c, p_0) = D_x f(f^{n-1}(c, p_0), p_0) D_p f^{n-1}(c, p_0) + D_p f(f^{n-1}(c, p_0), p_0)        (3.6)

to obtain the formula for D_p f^n(c, p_0):

D_p f^n(c, p_0) = D_p f(f^{n-1}(c, p_0), p_0) + \sum_{i=0}^{n-2} [ D_p f(f^i(c, p_0), p_0) \prod_{j=i+1}^{n-1} D_x f(f^j(c, p_0), p_0) ].

Thus, if |D_x f^n(f(c, p_0), p_0)| grows exponentially, we expect |D_p f^n(c, p_0)| to also grow
exponentially unless the parameter dependence is degenerate in some way (e.g., if f(x, p)
is independent of p).
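
The chain-rule expansion can be sanity-checked numerically. The following is a minimal sketch in Python, assuming the quadratic map f(x, p) = p x (1 - x). It computes D_p f^n(c, p) three ways: by the one-step recursion, by the expanded sum of products, and by a central finite difference in p. The function names, the step size h, and the test values p = 3.8 and n = 5 are assumptions made for this illustration.

```python
def f(x, p):
    """Quadratic map."""
    return p * x * (1.0 - x)


def dx_f(x, p):
    """Partial derivative of f with respect to x."""
    return p * (1.0 - 2.0 * x)


def dp_f(x, p):
    """Partial derivative of f with respect to p."""
    return x * (1.0 - x)


def iterate(x, p, n):
    """f^n(x, p)."""
    for _ in range(n):
        x = f(x, p)
    return x


def dp_recursive(c, p, n):
    """D_p f^n(c, p) from the one-step chain rule, starting from D_p f^0 = 0."""
    x, d = c, 0.0
    for _ in range(n):
        d = dx_f(x, p) * d + dp_f(x, p)
        x = f(x, p)
    return d


def dp_expanded(c, p, n):
    """The same derivative written as a sum of products along the orbit."""
    orbit = [iterate(c, p, i) for i in range(n)]
    total = dp_f(orbit[n - 1], p)
    for i in range(n - 1):
        term = dp_f(orbit[i], p)
        for j in range(i + 1, n):
            term *= dx_f(orbit[j], p)
        total += term
    return total


def dp_finite_difference(c, p, n, h=1e-7):
    """Central difference approximation of D_p f^n(c, p)."""
    return (iterate(c, p + h, n) - iterate(c, p - h, n)) / (2.0 * h)
```

For small n the three values should agree to within the finite-difference error. For large n the finite difference becomes unreliable, because the derivative itself grows exponentially.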

Now for any c ∈ C(f_{p_0}), define σ_n(c, p) recursively as follows:

σ_{n+1}(c, p) = sgn{D_x f(f^n(c, p), p)} σ_n(c, p)        (3.7)

where

σ_1(c, p) =  1 if c is a relative maximum of f_p
σ_1(c, p) = -1 if c is a relative minimum of f_p

Basically σ_n(c, p) = 1 if f_p^n has a relative maximum at c and σ_n(c, p) = -1 if f_p^n has a
relative minimum at c. We can use this notion to distinguish one direction in parameter
space from the other.
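
The sign recursion is easy to check against a direct test. The following is a minimal sketch in Python, assuming the quadratic map with turning point c = 1/2. It computes σ_1, ..., σ_n by the recursion and, independently, tests whether f^n has a relative maximum at c by comparing values at c and at c ± h. The function names, the offset h, and the parameter value 3.8 in the note below are assumptions made for this illustration.

```python
def f(x, p):
    """Quadratic map f_p(x) = p x (1 - x)."""
    return p * x * (1.0 - x)


def dx_f(x, p):
    """Derivative of f_p with respect to x."""
    return p * (1.0 - 2.0 * x)


def sigma_sequence(p, n_max, c=0.5):
    """sigma_1, ..., sigma_{n_max} for the quadratic map via the sign recursion."""
    sigmas = [1]                 # f_p has a relative maximum at c = 1/2
    x = f(c, p)                  # x = f^n(c, p), starting with n = 1
    for _ in range(n_max - 1):
        sign = 1 if dx_f(x, p) > 0 else -1
        sigmas.append(sign * sigmas[-1])
        x = f(x, p)
    return sigmas


def is_local_max(p, n, c=0.5, h=1e-4):
    """Direct check: compare f^n at c with its values at c - h and c + h."""
    def fn(x):
        for _ in range(n):
            x = f(x, p)
        return x
    return fn(c) > fn(c - h) and fn(c) > fn(c + h)
```

For p = 3.8 and n up to 6 the two approaches should agree. The direct test is only a numerical check and can fail when the orbit of c passes very close to a turning point.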




Definition: Let {f_p : I_x → I_x | p ∈ I_p} be a one-parameter family of mappings satisfying
(D0) and (D1). Suppose that there exists p_0 ∈ I_p such that f_{p_0} satisfies (CE1) and
(CP1) for some c ∈ C(f_{p_0}). Then we say that the turning point, c, of f_{p_0} favors higher
parameters if there exists N' > 0 such that

sgn{D_p f^n(c, p_0)} = σ_n(c, p_0)        (3.8)

for all n > N'. Similarly, the turning point, c, of f_{p_0} favors lower parameters if

sgn{D_p f^n(c, p_0)} = -σ_n(c, p_0)        (3.9)

for all n > N'.

The first thing to notice about these two definitions is that they are exhaustive if (CP1) is satisfied. That is, if (CP1) is satisfied for some $p_0 \in I_p$ and $c \in C(f_{p_0})$, then the turning point, c, of $f_{p_0}$ either favors higher parameters or favors lower parameters. We can see this from (3.6). Since $|D_p f(x,p_0)|$ is bounded for $x \in I_x$, if $|D_p f^n(c,p_0)|$ grows large enough then its sign is dominated by the signs of $D_x f(f^{n-1}(c,p_0),p_0)$ and $D_p f^{n-1}(c,p_0)$, so that either (3.8) or (3.9) must be satisfied.

Finally, if $p_0 \in I_p$ and $c \in C(f_{p_0})$, then for any $\epsilon > 0$, define $n_e(c,\epsilon,p_0)$ to be the smallest integer $n \ge 1$ such that $|f^n(c,p_0) - c_*| < \epsilon$ for any $c_* \in C(f_{p_0})$. We say that $n_e(c,\epsilon,p_0) = \infty$ if no such $n \ge 1$ exists.

Main  result 

We are now ready to state the main result of this subsection.

Theorem 3.3.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (D0) and (D1). Suppose that (CP1) is satisfied for some $p_0 \in I_p$ and $c \in C(f_{p_0})$. Suppose further that $f_{p_0}$ satisfies (CE1) at c, and that the turning point, c, favors higher parameters under $f_{p_0}$. Then there exist $\delta p > 0$, $\lambda > 1$, $K' > 0$, and $K > 1$, such that if $p \in (p_0 - \delta p, p_0)$, then for any $\epsilon > 0$, the orbit $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$ is not $\epsilon$-shadowed by
any orbit of $f_p$ if $|p - p_0| > K' \epsilon \lambda^{-n_e(c, K\epsilon, p_0)}$.

The  analogous  result  also  holds  if  fp,  favors  lower  parameters. 

Proof:  The  proof  of  this  result  can  be  found  in  appendix  C. 

The proof is actually relatively straightforward, although the details of the analysis become a bit tedious. The basic idea is that away from the turning points, everything is hyperbolic, and we can uniformly bound derivatives with respect to state and parameters to grow at an exponential rate. In particular, the lagging behavior for lower parameters is preserved and becomes exponentially more pronounced with increasing numbers of iterates. Shadowing orbits for parameters $p < p_0$ diverge away exponentially fast if higher parameters are favored. However, this only works for orbits that do not return closely to the turning points, where derivatives are small.
3.3.2  Leading  parameters

Motivation 

We have shown in the previous section that if $f : I_x \times I_p \to I_x$ is a one-parameter family of maps of the interval and if there exists $N > 0$ such that

$$D_p f^n(c,p_0) > \sigma_n(c,p_0) K \lambda^n \tag{3.10}$$

for all $n > N$, then for $p < p_0$, orbits of $f_p$ tend to diverge at an exponential rate away from orbits of $f_{p_0}$ that pass near the turning point, c. Such orbits of $f_{p_0}$ can only be shadowed by orbits of $f_p$ for $p < p_0$ if the orbits of $f_{p_0}$ are folded back upon themselves by a subsequent encounter with the turning point.

On the other hand, we would like to find a condition like (3.10) under which orbits of $f_p$ for $p > p_0$ can shadow any orbit of $f_{p_0}$ indefinitely without relying on folding. This type of phenomenon is indicated by numerical experiments on a variety of systems. Unfortunately, however, the derivative condition in (3.10) is local, so we have little control over the long term behavior of orbits. Thus, we must replace this condition with something that acts over an interval in parameter space.

For  instance,  we  are  interested  in  addressing  systems  like  the  family  of  quadratic 
maps: 


$$f(x,p) = px(1 - x). \tag{3.11}$$

It is known that the family of quadratic maps in (3.11) satisfies a property known as the monotonicity of kneading invariants in the parameter space of $f_p$. This condition is sufficient to make one direction in parameter space preferred over the other. We show in this subsection that monotonicity of kneading invariants along with (CE1) is sufficient to guarantee strong shadowing effects for parameters that lead, at least in the case of unimodal (one turning point) maps with negative Schwarzian derivative, a class of maps that includes (3.11). Maps with negative Schwarzian derivative have been the focal point of considerable research over the last several years, since they represent some of the simplest smooth maps which have interesting dynamical properties. We take advantage of analytical tools developed recently in order to analyze the relevant shadowing properties.

Definitions  and  statement  of  results 

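Definition: Suppose that $g : I \to I$ is $C^3$ and $I \subset \mathbb{R}$. Then the Schwarzian derivative, $Sg$, of $g$ is given by the following:

$$Sg(x) = \frac{g'''(x)}{g'(x)} - \frac{3}{2}\left(\frac{g''(x)}{g'(x)}\right)^2$$

where $g'(x)$, $g''(x)$, $g'''(x)$ here indicate the first, second, and third derivatives of $g$.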
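To make the definition concrete, here is a small sketch that evaluates the Schwarzian derivative for the quadratic map $f_p(x) = px(1-x)$, whose derivatives are $f_p' = p(1-2x)$, $f_p'' = -2p$, $f_p''' = 0$. The function names are assumptions made for this illustration. For this family the formula reduces to $Sf_p(x) = -6/(1-2x)^2$, which is negative wherever it is defined.

```python
def schwarzian(d1, d2, d3):
    """Schwarzian derivative from the first, second and third derivatives.

    Undefined where the first derivative d1 vanishes (at a turning point).
    """
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


def schwarzian_quadratic(x, p):
    """Schwarzian derivative of f_p(x) = p x (1 - x) at x != 1/2."""
    d1 = p * (1.0 - 2.0 * x)  # f_p'(x)
    d2 = -2.0 * p             # f_p''(x)
    d3 = 0.0                  # f_p'''(x)
    return schwarzian(d1, d2, d3)
```

For instance, `schwarzian_quadratic(0.25, 3.7)` evaluates $-6/(0.5)^2 = -24$, independent of $p$.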




In this section we will primarily restrict ourselves to mappings with the following properties:

(A0) $g : I \to I$ is $C^3(I)$ where $I = [0,1]$, with $g(0) = 0$ and $g(1) = 0$.

(A1) $g$ has one local maximum at $x = c$; $g$ is strictly increasing on $[0,c]$ and strictly decreasing on $[c,1]$.

(A2) $g''(c) < 0$ and $|g'(0)| > 1$.

(A3) The Schwarzian derivative of $g$ is negative, $Sg(x) < 0$, over all $x \in I$ (we allow $Sg(x) = -\infty$).

Again we will be investigating one-parameter families of mappings, $f : I_x \times I_p \to I_x$, where p is the parameter and $I_x, I_p \subset \mathbb{R}$ are closed intervals. Let $f_p(x) = f(x,p)$ where $f_p : I_x \to I_x$. We are primarily interested in one-parameter families of maps with the following characteristics:

(B0) For each $p \in I_p$, $f_p : I_x \to I_x$ satisfies (A0), (A1), (A2), and (A3) where $I_x = [0,1]$. For each p, we also require that $f_p$ has a turning point at c, where c is constant with respect to p.

(Bl)  f  :  IxX  Ip-*  4  is  for  all  (a:,p)  G  4  x  Ip. 

Another  concept  we  shall  need  is  that  of  the  kneading  invariant.  Kneading  invariants 
and  many  associated  topics  are  discussed  in  Milnor  and  Thurston  [34]. 

Definition: If $g : I \to I$ is a piecewise monotone map with exactly one turning point at c, then the kneading invariant, $D(g,t)$, of $g$ is defined as follows:

$$D(g,t) = 1 + \theta_1(g)\,t + \theta_2(g)\,t^2 + \cdots + \theta_n(g)\,t^n + \cdots$$

where

$$\theta_n(g) = \epsilon_1(g)\,\epsilon_2(g) \cdots \epsilon_n(g)$$
$$\epsilon_n(g) = \lim_{x \to c} \mathrm{sgn}(Dg(g^n(x)))$$

for $n \ge 1$. If c is a relative maximum of $g$, then one interpretation of $\theta_n(g)$ is that it represents whether $g^{n+1}$ has a relative maximum ($\theta_n(g) = +1$) or minimum ($\theta_n(g) = -1$) at c.
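The coefficients $\theta_n$ can be computed numerically by following the orbit of the turning point and multiplying the signs of the derivative along it. The sketch below assumes the quadratic map $g(x) = px(1-x)$ with $c = 1/2$ and assumes that the critical orbit never returns exactly to $c$ (in which case the limit in the definition would have to be handled separately); the function name is hypothetical.

```python
def kneading_coefficients(p, n_max, c=0.5):
    """Return [theta_1, ..., theta_n_max] for g(x) = p x (1 - x).

    theta_n is the running product of sgn Dg(g^k(c)) for k = 1..n.
    """
    thetas = []
    theta = 1
    x = c
    for _ in range(n_max):
        x = p * x * (1.0 - x)        # g^n(c)
        slope = p * (1.0 - 2.0 * x)  # Dg evaluated at g^n(c)
        if slope == 0:
            raise ValueError("critical orbit returned to c; limit needed")
        theta *= 1 if slope > 0 else -1
        thetas.append(theta)
    return thetas
```

At $p = 4$ the critical orbit is $1/2 \to 1 \to 0 \to 0 \to \dots$, so $\epsilon_1 = -1$ and $\epsilon_n = +1$ for $n \ge 2$, giving $\theta_n = -1$ for all $n$. This agrees with the interpretation above: $g^{n+1}$ has a relative minimum at $c$ for every $n \ge 1$.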

We can also order these kneading invariants in the following way. We will say that $|D(g,t)| < |D(h,t)|$ if $\theta_i(g) = \theta_i(h)$ for $1 \le i < n$, but $\theta_n(g) < \theta_n(h)$. A kneading invariant, $D(f_p,t)$, is said to be monotonically decreasing with respect to p if $p_1 > p_0$ implies $|D(f_{p_1},t)| \le |D(f_{p_0},t)|$.

We  are  now  ready  to  state  the  main  result  of  this  subsection: 

Theorem 3.3.2 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (B0) and (B1). Suppose that $p_0 \in I_p$ is such that $f_{p_0}$ satisfies (CE1). Also, suppose that the kneading invariant, $D(f_p,t)$, is monotonically decreasing with respect to p in some neighborhood of $p = p_0$. Then there exist $\delta p > 0$ and $C > 0$ such that for every $x_0 \in I_x$ there is a set, $W(x_0) \subset I_x \times I_p$, satisfying the following conditions:

(1) $W(x_0) = \{(\alpha_{x_0}(t), \beta_{x_0}(t)) \mid t \in [0,1]\}$ where $\alpha_{x_0} : [0,1] \to I_x$ and $\beta_{x_0} : [0,1] \to I_p$ are continuous and $\beta_{x_0}(t)$ is monotonically increasing with respect to t with $\beta_{x_0}(0) = p_0$ and $\beta_{x_0}(1) = p_0 + \delta p$.

(2) For any $x_0 \in I_x$, if $(x,p) \in W(x_0)$ then $|f^n(x,p) - f^n(x_0,p_0)| < C(p - p_0)^{1/3}$ for all $n \ge 0$.

Proof: See appendix D.

Corollary 3.3.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (B0) and (B1). Suppose that $p_0 \in I_p$ is such that $f_{p_0}$ satisfies (CE1). Also, suppose that the kneading invariant, $D(f_p,t)$, is monotonically decreasing with respect to p in some neighborhood of $p = p_0$. Then there exist $\delta p > 0$ and $C > 0$ such that if $p \in [p_0, p_0 + \delta p]$, then for any $\epsilon > 0$, every orbit of $f_{p_0}$ is $\epsilon$-shadowed by an orbit of $f_p$ if
$|p - p_0| < C\epsilon^3$.

Proof:  This  is  an  immediate  consequence  of  theorem  3.3.2. 

Overview  of  proof 

We now outline some of the ideas behind the proof of theorem 3.3.2. The proof depends on an examination of the structure of the preimages of the turning point, $x = c$, in the combined space of state and parameters ($I_x \times I_p$ space). The basic idea is to find connected shadowing sets in state-parameter space. These sets have the property that points in the set shadow each other under arbitrarily many applications of $f$. Certain geometrical properties of these sets can be determined by squeezing the sets between structures of preimage points. In order to discuss the approach further, we first need to introduce some notation.

We consider the set of preimages, $P(n) \subset I_x \times I_p$, satisfying:

$$P(n) = \{(x,p) \mid f^i(x,p) = c \text{ for some } 0 \le i \le n\}.$$

It is also useful to have a way of specifying a particular section of path-connected preimages, $R(n,x_0,p_0) \subset P(n)$, extending from a single point, $(x_0,p_0) \in P(n)$. Let us define $R(n,x_0,p_0)$ so that $(x',p') \in R(n,x_0,p_0)$ if and only if $(x',p') \in P(n)$ and there exists a continuous function, $g : I_p \to I_x$, such that $g(p_0) = x_0$, $g(p') = x'$, and

$$\{(x,p) \mid x = g(p),\ p \in [p_0;p']\} \subset P(n),$$

where $[p_0;p']$ may denote either $[p_0,p']$ or $[p',p_0]$, whichever is appropriate.
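For intuition, a slice of $P(n)$ at a fixed parameter value can be computed by repeatedly pulling the turning point back through the two inverse branches of the map. The following sketch assumes the quadratic map $f(x,p) = px(1-x)$ with $c = 1/2$, whose inverse branches are $x = (1 \pm \sqrt{1 - 4y/p})/2$; the function name and the level-by-level output format are choices made for this illustration.

```python
import math


def preimages_of_turning_point(p, n, c=0.5):
    """Return levels[0..n], where levels[i] lists the x with f_p^i(x) = c.

    The union of all levels is the slice of P(n) at parameter p for
    the quadratic map f(x, p) = p x (1 - x).
    """
    levels = [[c]]
    for _ in range(n):
        nxt = []
        for y in levels[-1]:
            disc = 1.0 - 4.0 * y / p
            if disc < 0:
                continue  # y > p/4 has no preimage under f_p
            r = math.sqrt(disc)
            # a set avoids duplicating the double root when r == 0
            nxt.extend({(1.0 - r) / 2.0, (1.0 + r) / 2.0})
        levels.append(sorted(nxt))
    return levels
```

Repeating this computation over a grid of parameter values traces out the curves of preimages in $I_x \times I_p$ whose structure is described in the paragraphs that follow.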

The first step is to investigate the basic structure of $P(n)$. We show that $P(n)$ contains no regions or interior points and that $P(n)$ cannot contain any isolated points or curve segments. Instead, each point in $P(n)$ must be part of a continuous curve that stretches for the length of the parameter space, $I_p$. In fact, if $(x_0,p_0) \in P(n)$, then $R(n,x_0,p_0) \cap (I_x \times \{\sup I_p\}) \ne \emptyset$ and $R(n,x_0,p_0) \cap (I_x \times \{\inf I_p\}) \ne \emptyset$.

The next step is to demonstrate that if the kneading invariant of $f_p$, $D(f_p,t)$, is monotonically decreasing (or increasing), then $P(n)$ has a special topology. It must take on a tree-like structure so that as we travel along one direction in parameter space, branches of $P(n)$ must either always merge or always split away from each other. For example, if $D(f_p,t)$ is monotonically decreasing, then branches of $P(n)$ can only split away from each other as we increase the parameter p. In other words, $R(n,y_-,p_0)$ and $R(n,y_+,p_0)$ do not intersect each other in the space, $I_x \times \{p\}$, for $p > p_0$ if $y_+ \ne y_-$ and $y_+, y_- \in I_x$.

Now suppose we want to examine the points that shadow $(x_0,p_0)$ under the action of $f$ given any $x_0 \in I_x$. We first develop bounds on derivatives for differentiable sections of $R(n,x,p_0)$. We then use knowledge about the behavior of $R(n,x,p_0)$ to bound the behavior of the shadowing points. We demonstrate that for maps, $f_p$, with kneading invariants that decrease monotonically in parameter space, there exist constants $C > 0$ and $\delta p > 0$ such that if $x_0 \in I_x$ and

$$U(p) = \{x \mid |x - x_0| < C(p - p_0)^{1/3}\} \tag{3.12}$$

for any $p \in I_p$, then for any $p' \in [p_0, p_0 + \delta p]$, there exists $x'_+ \in U(p')$ such that $(x'_+,p') \in R(n_+,y_+,p_0)$ for some $y_+ > x_0$ and $n_+ > 0$ assuming that $f^{n_+}(y_+,p_0) = c$. Likewise there exists $x'_- \in U(p')$ such that $(x'_-,p') \in R(n_-,y_-,p_0)$ for some $y_- < x_0$ and $n_- > 0$ where $f^{n_-}(y_-,p_0) = c$.

However, setting $n = \max\{n_+,n_-\}$, since $R(n,y_-,p_0)$ and $R(n,y_+,p_0)$ do not intersect each other for $p > p_0$ and $y_- \ne y_+$, we also know that for any $y_- < y_+$, there is a region in $I_x \times I_p$ space bounded by $R(n,y_-,p_0)$, $R(n,y_+,p_0)$, and $p \ge p_0$. Take the limit of this region as $y_- \to x_0$, $y_+ \to x_0$, and $n \to \infty$. Call the resulting region $S(x_0)$. We observe that $S(x_0)$ is a connected set that is invariant under $f$ and is nonempty for every parameter value $p \in I_p$ such that $p \ge p_0$ (by invariant we mean that $f(S(x_0)) = S(f(x_0,p_0))$). Thus, since $S(x_0)$ is bounded by (3.12), there exists a set of points, $S(x_0)$, in combined state and parameter space that shadow any trajectory, $\{f_{p_0}^n(x_0)\}_{n=0}^{\infty}$, of $f_{p_0}$. Finally we observe that there exists a subset of $S(x_0)$ that can be represented by the form given for $W(x_0)$.


3.4  Example:  quadratic  map 

In this section we examine how the results of section 3.3 apply to the quadratic map, $f_p : [0,1] \to [0,1]$, where:

$$f_p(x) = px(1 - x) \tag{3.13}$$

and $p \in [0,4]$. For the rest of this section, $f_p$ will refer to the map given in (3.13), and $f(x,p) = f_p(x)$ for any $(x,p) \in I_x \times I_p$ where $I_x = [0,1]$ and $I_p = [0,4]$.

We have already seen in conjecture 3.1.1 that there appears to be a dense set of parameters in $I_p$ for which $f_p$ is structurally stable and has a hyperbolic periodic attractor. However, by the following result, we find that there is also a large set of parameters for which $f_p$ satisfies the Collet-Eckmann conditions and is not structurally stable:


Theorem 3.4.1 Let E be the set of parameter values, p, such that (CE1) is satisfied for the family of quadratic maps, $f_p$, given in (3.13). Then E is a set of positive Lebesgue measure. Specifically, E has a density point at $p = 4$ so that:

$$\lim_{\epsilon \to 0} \frac{\lambda(E \cap [4-\epsilon, 4])}{\epsilon} = 1, \tag{3.14}$$

where $\lambda(S)$ represents the Lebesgue measure of the set S.


Proof:  The  first  proof  of  this  result  was  given  in  [5].  The  reader  should  also  consult 
the  proof  given  in  [33].^ 

Apparently, if we pick a parameter, $p_0$, at random from $I_p$ (with uniform distribution on $I_p$) there is a positive probability that $f_{p_0}$ will satisfy (CE1). We might note that numerical evidence suggests that the set of parameters, p, resulting in maps, $f_p$, which satisfy (CE1) is not just concentrated in a small neighborhood of $p = 4$.

In any case, applying the results of the last section, we see that for a positive measure of parameter values, there is a definite asymmetry with respect to shadowing results in parameter space. The following theorem illustrates this fact.

Footnote: These two references actually deal with the family of maps, $g_a(x) = 1 - ax^2$, where a is the parameter. However, the maps $g_a$ and $f_p$ are topologically conjugate if $a = (p^2 - 2p)/4$. The conjugating homeomorphism in this case is simply a linear function. Thus the results in the references immediately apply to the family of quadratic maps, $f_p : I_x \to I_x$ for $p \in I_p$.
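The conjugacy mentioned in the footnote can be checked numerically. The sketch below assumes the change of variables $y = h(x) = (x - 1/2)/\alpha$ with $\alpha = (p-2)/4$, which gives $a = p(p-2)/4$ and requires $p \ne 2$; these explicit formulas and all function names are worked out for this illustration rather than quoted from the references.

```python
def quadratic(x, p):
    """f_p(x) = p x (1 - x)."""
    return p * x * (1.0 - x)


def g(y, a):
    """g_a(y) = 1 - a y^2."""
    return 1.0 - a * y * y


def conjugate_parameter(p):
    """Parameter a for which g_a is conjugate to f_p (assumed p != 2)."""
    return p * (p - 2.0) / 4.0


def h(x, p):
    """Affine change of variables carrying f_p to g_a (assumed p != 2)."""
    return (x - 0.5) / ((p - 2.0) / 4.0)
```

With these definitions, `h(quadratic(x, p), p)` and `g(h(x, p), conjugate_parameter(p))` should agree for every `x`, and `conjugate_parameter(4.0)` gives `2.0`, so $p = 4$ corresponds to $a = 2$.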
Theorem 3.4.2 Let $I_p = [0,4]$, $I_x = [0,1]$, and $f_p : I_x \to I_x$ be the family of quadratic maps such that $f_p(x) = px(1 - x)$ for $p \in I_p$. Then there exist constants $\delta > 0$, $C > 0$, $K > 0$, and a set $E(\gamma) \subset I_p$ with positive Lebesgue measure for every $\gamma > 1$ such that:

(1) If $\gamma > 1$ and $p_0 \in E(\gamma)$, then $f_{p_0}$ satisfies (CE1).

(2) If $f_{p_0}$ satisfies (CE1), then for any $\epsilon > 0$ sufficiently small, any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$ for $p \in [p_0, p_0 + C\epsilon^3]$.

(3) If $\gamma > 1$ and $p_0 \in E(\gamma)$, then for any $\epsilon > 0$, almost no orbits of $f_{p_0}$ can be $\epsilon$-shadowed by any orbit of $f_p$ for $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$. That is, the set of possible initial conditions, $x_0 \in I_x$, such that the orbit $\{f_{p_0}^n(x_0)\}_{n=0}^{\infty}$ can be $\epsilon$-shadowed by some orbit of $f_p$ comprises at most a set of Lebesgue measure zero on $I_x$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$.

Proof  of  theorem  3.4.2:  The  full  proof  for  this  result  can  be  found  in  appendix  E. 

Before we take a look at an overview of the proof for theorem 3.4.2, it is useful to make a few remarks. First of all, one might wonder whether the asymmetrical situation in theorem 3.4.2 is really generic for all $p_0 \in I_p$ such that $f_{p_0}$ satisfies (CE1). For example, are there other parameter values in $I_p$ for which it is easier to shadow lower parameter values than it is to shadow higher parameter values? Numerical evidence indicates that most if not all $p \in I_p$ exhibit asymmetrical shadowing properties if $f_p$ has positive Lyapunov exponents. Furthermore, it seems that these parameter values favor the same specific direction in parameter space. In fact it is easy to show analytically that condition (2) of theorem 3.4.2 actually holds for all $p_0 \in I_p$ for which $f_{p_0}$ satisfies (CE1). In other words, for $f_{p_0}$ satisfying (CE1), there exists $C > 0$ such that for any $\epsilon > 0$ sufficiently small, any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$ if $p \in [p_0, p_0 + C\epsilon^3]$.

We now outline the strategy for the proof of theorem 3.4.2. For parts (1) and (3) we basically want to combine theorem 3.3.1 and theorem 3.4.1 in the appropriate way. There are four major steps. We first bound the return time of the orbit of the turning point, $c = \tfrac{1}{2}$, to neighborhoods of c. Next we show that $f_p$ satisfies (CP1) and favors higher parameters on a positive measure of parameter values. This allows us to apply theorem 3.3.1. Finally we show that almost every orbit of these maps approaches arbitrarily close to c so that if the orbit, $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$, cannot be shadowed then almost all other orbits of $f_{p_0}$ cannot be shadowed either.

We bound the return time of the orbit of the turning point, c, to neighborhoods of c by examining the proof of theorem 3.4.1. Specifically, as part of the proof of theorem 3.4.1, Benedicks and Carleson [5] show that for any $\alpha > 0$, there is a set of positive measure in parameter space, $S(\alpha) \subset I_p$, such that if $p_0 \in S(\alpha)$ then $f_{p_0}$ satisfies (CE1) and the condition:

$$|f_{p_0}^i(c) - c| > e^{-\alpha i} \tag{3.15}$$

for all $i \in \{0, 1, 2, \ldots\}$. The set, $S(\alpha)$, has a density point at parameter value $p = 4$.

Next we show that $f_p$ satisfies (CP1) and favors higher parameters on a subset of $S(\alpha)$ of positive measure. This is basically done by looking at what happens for $p = 4$ and extrapolating that result for parameters in a small interval in parameter space around $p = 4$. The result only works for those values of p for which $f_p$ satisfies (CE1). However, since $p = 4$ is a density point of $S(\alpha)$, for any $\alpha > 0$, there is a set, $S_*(\alpha)$, contained in a neighborhood of $p = 4$ with a density point at $p = 4$ for which $p_0 \in S_*(\alpha)$ implies $f_{p_0}$ satisfies (CE1) and (3.15), and $f_p$ favors higher parameters and satisfies (CP1) at $p = p_0$.

Then by applying theorem 3.3.1 we see that there exist constants $\delta > 0$, $K_0 > 0$ and $K_1 > 0$ such that for any $\alpha > 0$, if $p_0 \in S_*(\alpha)$ then the orbit, $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$, cannot be shadowed by any orbit of $f_p$ for $p \in (p_0 - \delta, p_0 - K_0 \epsilon \lambda^{-n_e(c, K_1\epsilon, p_0)})$ (recall that $n_e(c,\epsilon,p_0)$ is defined to be the smallest integer $n \ge 1$ such that $|f^n(c,p_0) - c| < \epsilon$). By controlling $\alpha > 0$ in (3.15) we can effectively control $n_e(c,\epsilon,p_0)$ to be whatever we want. Thus for any $\gamma > 1$ we can choose a set $E(\gamma) \subset I_p$ with a density point at $p = 4$ such that if $p_0 \in E(\gamma)$ then $f_{p_0}$ satisfies (CE1) and no orbits of $f_p$ $\epsilon$-shadow the orbit, $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$, for $p \in (p_0 - \delta, p_0 - K_0(K_1\epsilon)^{\gamma})$. But since $\gamma > 1$, if we set the constant $K = \max\{K_0 K_1, K_1\} > 0$ we see that $p_0 - K_0(K_1\epsilon)^{\gamma} > p_0 - (K\epsilon)^{\gamma}$ for any $\epsilon > 0$. Thus, no orbits of $f_p$ may $\epsilon$-shadow $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$.

Finally it is known that if $f_{p_0}$ satisfies (CE1) then almost every orbit of $f_{p_0}$ approaches arbitrarily close to c. Thus for almost all $x_0 \in I_x$, the orbit, $\{f_{p_0}^n(x_0)\}_{n=0}^{\infty}$, cannot be shadowed by an orbit of $f_p$ if the orbit, $\{f_{p_0}^n(c)\}_{n=0}^{\infty}$, cannot be shadowed by any orbit of $f_p$. Consequently, we see that for any $\gamma > 1$ if $p_0 \in E(\gamma)$ then $f_{p_0}$ satisfies (CE1) and almost no orbits of $f_{p_0}$ can be shadowed by any orbit of $f_p$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$. This would prove parts (1) and (3) of the theorem.

Part (2) of theorem 3.4.2 is a direct result of corollary 3.3.1 and the following result, due to Milnor and Thurston [34]:

Lemma 3.4.1 The kneading invariant, $D(f_p,t)$, is monotonically decreasing with respect to p for all $p \in I_p$.

Thus if $f_{p_0}$ satisfies (CE1) for some $p_0 \in E(\gamma)$, there exists a constant $C > 0$ such that any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$ if $p \in [p_0, p_0 + C\epsilon^3]$. This proves part (2) of the theorem.


3.5  Remarks  on  convergence  of  parameter  estimates 

In  order  to  determine  the  feasibility  of  parameter  estimation  applications,  it  is  important 
to  have  some  idea  about  how  many  state  samples  are  likely  to  be  needed  in  order  to  attain 
a  certain  accuracy  in  the  parameter  estimate.  Ergodic  theory  comes  into  play  here,  since 
we would like to consider the behavior of typical orbits. In particular, suppose that a data
stream  is  generated  from  an  initial  condition  that  is  chosen  at  random  after  the  system 
has  settled  into  its  equilibrium  behavior.  We  would  like  to  estimate  the  rate  at  which  a 
parameter  estimate  is  likely  to  converge  with  increasing  numbers  of  measurements  from 
the  data  stream.  In  this  section,  we  outline  ideas  on  how  to  approach  this  question,  and 
make  certain  conjectures  about  convergence  results.  These  conjectures  closely  match 
numerical  results  attained  from  actual  parameter  estimation  techniques  as  shown  in 
chapter  6  of  this  report. 

We  have  already  seen  that  the  accuracy  of  a  parameter  estimate  for  a  piecewise 
monotone  map  depends  on  how  close  the  orbit  being  sampled  comes  to  the  turning 
points  of  the  map.  When  an  orbit  comes  close  to  a  turning  point,  nearby  regions  in 
state  space  are  subject  to  a  folding  effect  that  enables  us  to  distinguish  small  differences 
in parameters based on state data. With a given level of measurement noise, $\epsilon$, there
often exists a lower limit on the parameter estimation accuracy resulting from folding
and refolding effects near turning points (see theorem 3.2.1). This bound is related to
the amount of time it takes for an orbit near a turning point to return within $\epsilon$ distance
of  a  turning  point.  For  most  numerical  purposes,  however,  this  lower  limit  is  often  too 
small  to  be  of  practical  importance.  Thus,  it  is  important  to  consider  the  approximate 
rate  at  which  a  parameter  estimate  is  likely  to  converge,  before  the  system  reaches  the 
lower  limit  in  the  accuracy  of  the  parameter  estimate. 


Assuming that a family of piecewise monotone maps, $\{f_p \mid p \in I_p \subset \mathbb{R}\}$, has the same number of turning points for all $p \in I_p$, this turns out to be equivalent to asking the following question: Given a typical orbit, $\{x_n\}_{n=1}^{N}$, of $f_{p_0}$ (with $p_0 \in I_p$), as N increases, for what parameter values, p, do there exist shadowing orbits, $\{y_n(p)\}_{n=1}^{N}$, of $f_p$, such that $y_n(p)$ and $x_n$ lie on the same monotone branch of $f_{p_0}$ for each $n \in \{1, 2, \ldots, N\}$? In other words, if $c_1 < c_2 < \cdots < c_m$ are the turning points of $f_p$ for all $p \in I_p$, then for any $n \in \{1, 2, \ldots, N\}$, we require that $x_n \in [c_i, c_{i+1}]$ implies $y_n(p) \in [c_i, c_{i+1}]$. This makes sense because the lower limit in the accuracy of the parameter estimate results from the fact that orbits can shadow each other by evolving on different monotone branches, so that state space regions around an orbit for a map with leading parameters get refolded more than regions around shadowing orbits for maps with lagging parameters. Henceforth, given the family, $f_p$, of piecewise monotone maps described above, we will say that a sequence of points, $\{y_n\}_{n=1}^{N}$, $\epsilon$-monotone-shadows an orbit $\{x_n\}_{n=1}^{N}$ of $f_{p_0}$ if $y_n$ and $x_n$ lie on the same monotone branch of $f_{p_0}$ for each $n \in \{1, 2, \ldots, N\}$ and if $|y_n - x_n| < \epsilon$ for each $n \in \{1, 2, \ldots, N\}$.
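The definition of $\epsilon$-monotone-shadowing translates directly into a check on two finite sequences. The sketch below is a minimal illustration with hypothetical function names; as a simplification it assigns a point lying exactly on a turning point to the branch on its right, whereas the definition above uses closed intervals and so allows such a point to belong to either adjacent branch.

```python
from bisect import bisect_right


def branch_index(x, turning_points):
    """Index of the monotone branch containing x.

    turning_points must be sorted in increasing order.
    """
    return bisect_right(turning_points, x)


def monotone_shadows(ys, xs, turning_points, eps):
    """True if the sequence ys eps-monotone-shadows the sequence xs."""
    if len(ys) != len(xs):
        raise ValueError("sequences must have the same length")
    return all(
        abs(y - x) < eps
        and branch_index(y, turning_points) == branch_index(x, turning_points)
        for y, x in zip(ys, xs)
    )
```

For a unimodal map with turning point $1/2$, the pair `xs = [0.2, 0.51]`, `ys = [0.21, 0.49]` is rejected even for `eps = 0.05`, because the second points lie on opposite sides of the turning point although they are closer than `eps`.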


Using these ideas, we make the following conjectures:

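Conjecture 1: Consider the family of tent maps, $\{g_p : I_x \to I_x \mid p \in I_p\}$, where $I_x = [0,1]$,

$$g_p(x) = \begin{cases} px & \text{if } x \le \tfrac{1}{2} \\ p(1 - x) & \text{if } x > \tfrac{1}{2} \end{cases}$$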
and Ip = (|,2]. Given $\epsilon > 0$ and $p_0 \in I_p$, for almost all $x_0 \in I_x$ there is a constant
$K > 0$ such that for each positive integer N, there exists a $p \in (p_0 - K/N, p_0]$ such that the orbit $\{g_{p_0}^n(x_0)\}_{n=0}^{N}$ is not $\epsilon$-monotone-shadowed by any orbit of $g_p$.

Numerical results indicate that the error in the estimate of the parameters of the tent map tends to converge at a rate proportional to $1/N$, where N is the number of observations. Similarly we have:

Conjecture 2: Consider the family of quadratic maps, $\{f_p : I_x \to I_x \mid p \in I_p\}$, where $I_x = [0,1]$,

$$f_p(x) = px(1 - x) \tag{3.16}$$

and $I_p = (2,4]$. Then there exists a set $E \subset I_p$ of positive Lebesgue measure such that if $p_0 \in E$, then given $\epsilon > 0$, for almost all $x_0 \in I_x$, there is a constant $K > 0$ such that for each positive integer N, there exists a $p \in (p_0 - K/N^2, p_0]$ such that the orbit $\{f_{p_0}^n(x_0)\}_{n=0}^{N}$ is not $\epsilon$-monotone-shadowed by any orbit of $f_p$.

Furthermore, we expect that the error in the parameter estimate of the quadratic map should converge at a rate proportional to $1/N^2$, where N is the number of observations processed. In chapter 6, we will see that this appears to agree with numerical results.

The  rest  of  this  section  will  be  devoted  to  motivating  these  two  conjectures.  In  order 
to  estimate  the  convergence  rate  of  the  parameter  estimate,  we  first  need  an  estimate  of 
how  fast  an  orbit  is  likely  to  approach  a  turning  point.  It  turns  out  that  the  maps  we 
are  interested  in  are  ergodic  so  that  the  long  term  average  behavior  of  almost  all  orbits 
of the maps can be described by the appropriate invariant measure of the map. Thus,
in order to estimate how fast most orbits approach a turning point of a map, it is helpful
to examine the invariant measures of the map.

$\mu$ is said to be an invariant measure of the map $h : I_x \to I_x$ if $\mu(h^{-1}(A)) = \mu(A)$ for any open set $A \subset I_x$. Every ergodic map, $h : I_x \to I_x$, has an associated invariant measure, $\mu$, such that for any continuous function $\phi : I_x \to \mathbb{R}$, the relation,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \phi(h^i(x_0)) = \int_{x \in I_x} \phi(x)\,\mu(dx),$$

holds for $\mu$-almost all $x_0 \in I_x$. Thus, one might say that the "time-average" equals the "space-average" of an ergodic map. The density, $\rho$, of the measure $\mu$ satisfies the property that $\int_A \rho(x)\,dx = \mu(A)$ for any open $A \subset I_x$.
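The time-average and space-average relation can be illustrated numerically. The sketch below is not taken from the report: it assumes the quadratic map at $p = 4$, for which the invariant density is known in closed form, $\rho(x) = 1/(\pi\sqrt{x(1-x)})$, and it uses the substitution $x = \sin^2(\pi u/2)$, under which this measure becomes uniform in $u \in [0,1]$. The function names and the initial condition are arbitrary choices.

```python
import math


def time_average(phi, x0, n_steps, p=4.0):
    """Average of phi along an orbit of f_p(x) = p x (1 - x) from x0."""
    x, total = x0, 0.0
    for _ in range(n_steps):
        total += phi(x)
        x = p * x * (1.0 - x)
    return total / n_steps


def space_average(phi, n_cells=10000):
    """Integral of phi against rho(x) = 1 / (pi sqrt(x (1 - x))).

    Uses x = sin^2(pi u / 2), which maps the uniform measure on u in
    [0, 1] to the invariant measure of the p = 4 map (midpoint rule in u).
    """
    total = 0.0
    for k in range(n_cells):
        u = (k + 0.5) / n_cells
        total += phi(math.sin(0.5 * math.pi * u) ** 2)
    return total / n_cells
```

With `phi = lambda x: x`, the space average is $1/2$, and the time average along a typical orbit (for example from `x0 = 0.1234`) should approach the same value as the number of steps grows, up to statistical fluctuation and floating-point effects.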

Conjecture  1: 

Let us now outline the motivation behind Conjecture 1. The tent map, $g_p$, is ergodic if $p \in I_p$. The density, $\rho_p$, of the associated invariant measure $\mu_p$ of $g_p$ is simply a constant over the region, $[g_p^2(c), g_p(c)]$. We expect that for $\mu_p$-almost all initial conditions $x_0 \in [0,1]$ there exists a $K > 0$ such that:

$$\min_{0 \le i \le n} |g_p^i(x_0) - c| < \frac{K}{n}$$

is satisfied for all $n > 0$ if $p \in I_p$.

Footnote on invariant measures: For more information regarding invariant measures and ergodic theory of maps of the interval, please refer to chapter V in [33].

Keeping this observation in mind, let $c = \tfrac{1}{2}$ be the critical point of the tent map. One can show that $g_p$ favors higher parameters for all $p \in I_p$. In other words, using the same notation as in (3.7) and (3.8), we know that

$$\mathrm{sgn}\{D_p g^n(c,p)\} = \sigma_n(c,p) \tag{3.17}$$

for all $n \ge 1$. It is also not difficult to show that there exist constants $K_1 > 0$ and $K_2 > 0$ such that

$$K_1 p^n < |D_p g^n(c,p)| < K_2 p^n \tag{3.18}$$

for all $n \ge 1$ if $p \in I_p$.
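Conditions (3.17) and (3.18) can be explored numerically by propagating $\sigma_n$ and $D_p g^n(c,p)$ together along the orbit of the critical point. The following sketch assumes the tent map as written in Conjecture 1 with $c = 1/2$, uses hypothetical names, and is only a numerical probe for particular parameter values, not a proof of either condition.

```python
def tent_sensitivity(p, n_max, c=0.5):
    """Return [(sigma_n, D_p g^n(c, p))] for n = 1..n_max.

    Tent map: g(x, p) = p x for x <= c and p (1 - x) for x > c.
    """
    out = []
    x = p * c   # g^1(c, p)
    d = c       # D_p g^1(c, p)
    sigma = 1   # c is a relative maximum of g_p
    for _ in range(n_max):
        out.append((sigma, d))
        if x <= c:
            slope, dgdp, x_next = p, x, p * x
        else:
            slope, dgdp, x_next = -p, 1.0 - x, p * (1.0 - x)
        d = slope * d + dgdp                     # chain rule, as in (3.6)
        sigma = sigma if slope > 0 else -sigma   # recursion (3.7)
        x = x_next
    return out
```

For $p = 1.5$ the product $\sigma_n D_p g^n(c,p)$ stays positive and $|D_p g^n(c,p)|/p^n$ stays between fixed positive bounds over the first thirty iterates. For $p = 1.2$ the sign condition already fails at $n = 3$, where $\sigma_3 D_p g^3(c,p) = p(\tfrac{3}{2}p - 2) < 0$, which illustrates why the parameter range has to be restricted.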

Now given $p_0 \in I_p$ and an initial condition $x_0 \in [0,1]$, consider the finite orbit $\{g_{p_0}^n(x_0)\}_{n=0}^{N}$. We would like to determine if there is an orbit of $g_p$ that $\epsilon$-monotone-shadows this orbit for $p < p_0$. To first order, this is basically determined by the magnitude of

$$\Delta_N = \min_{0 \le n \le N} |g_{p_0}^n(x_0) - c| \tag{3.19}$$

because regions of state space near the critical point c get folded, producing the leading and lagging behavior which in turn leads to asymmetrical shadowing in parameter space. Since the tent map favors higher parameters, $g_p$ cannot $\epsilon$-monotone-shadow the orbit $\{g_{p_0}^n(c + \Delta_N)\}_{n \ge 1}$ for $p < p_0$ if:

$$\sigma_n(c,p)\,[g_{p_0}^n(c + \Delta_N) - g_p^n(c)] > \epsilon \tag{3.20}$$

for any $n \le m$. Suppose that the inequality in (3.20) is false for all $n \le m - 1$. Then, from (3.17) and (3.18),

$$\sigma_m(c,p)\,[g_{p_0}^m(c + \Delta_N) - g_p^m(c)] > K_1 p^m (p_0 - p) - p_0^m \Delta_N.$$


Now  suppose  that 


^  XT- 


(3.21) 


Footnote: Actually $g_p$ favors higher parameters for all $p \in (1,2]$. We confine our discussion here to $p \in I_p$ for convenience since (3.17) may only hold for $n > N_0$ for some $N_0 > 1$ if $p \in (1,2] \setminus I_p$. However, we suspect that Conjecture 1 also holds for any $p \in (1,2]$.
We find that:


«^m(c,p)[^™(c  + Aiv)  -^^(c)]  >  ^  -  1] 

So for $\Delta_N$ sufficiently small and $p \in I_p$, we can see that the inequality in (3.20) holds for the value m given in (3.21) if $p_0 - p > 3\Delta_N$.

Thus, given sufficiently small $\Delta_N$ as defined in (3.19), there exists a $p \in (p_0 - 3\Delta_N, p_0]$ such that no orbit of $g_p$ can $\epsilon$-monotone-shadow the orbit $\{x_n\}_{n=0}^{N}$. Recall, however, that $\Delta_N$ should decrease at a rate at least proportional to $1/N$. Applying this fact, we get a result similar to Conjecture 1.

Conjecture  2: 

Now let us consider Conjecture 2. The basic idea here is similar to Conjecture
1. However, Conjecture 2 presents some additional complications. First, the invariant
measures for the quadratic map are more complicated, and cannot be written in closed
form. Second, there is no uniform expansion available in state or parameter space, so
that it is not a simple matter to bound the quantity $f_{p_0}^i(c + \Delta_N) - f_p^i(c)$ for small $\Delta_N$
and for $p$ near $p_0$.

The invariant measures of the quadratic map have been the subject of vigorous
research over the past several years. Nowicki and van Strien show in [46] that for the
maps given in (3.16), if $f_{p_0}$ satisfies (CE1) for some $p_0 \in (1, 2]$, then $f_{p_0}$ has an ergodic
invariant measure $\mu$ such that for any measurable set $A \subset [0, 1]$ there exists a constant
$K > 0$ such that $\mu(A) \le K|A|^{1/2}$ (where $|A|$ is the Lebesgue measure of the set $A$).

Now consider the interval $A_\epsilon = (c - \frac{1}{2}\epsilon, c + \frac{1}{2}\epsilon)$. Note that there exists $K' > 0$ such
that for any $\epsilon > 0$, $|f_{p_0}(A_\epsilon)| < K'\epsilon^2$. Thus, from Nowicki and van Strien's result, we know
that there exists $K_1 > 0$ such that for any $\epsilon > 0$:

$$\mu(A_\epsilon) = \mu(f_{p_0}(A_\epsilon)) \le K|f_{p_0}(A_\epsilon)|^{1/2} \le K_1 \epsilon.$$

Furthermore, it is fairly easy to show that there also exists $K_2 > 0$ such that for any
$\epsilon > 0$, $\mu(A_\epsilon) \ge K_2 \epsilon$. Thus, since $K_2 \epsilon \le \mu(A_\epsilon) \le K_1 \epsilon$ for any $\epsilon$, we expect that for almost
all initial conditions $x_0 \in [0, 1]$, the quantity

$$\Delta_N(x_0) = \min_{0 \le i \le N} |f_{p_0}^i(x_0) - c| \tag{3.22}$$

will decay at a rate proportional to $1/N$.
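The expected decay of the closest approach to the critical point can be checked numerically. The following is a minimal sketch, assuming the quadratic map has the form $f_p(x) = px(1 - x)$ with $c = 1/2$ (the form quoted later in this chapter); the function name, the parameter value $p_0 = 4$, and the initial condition are illustrative choices only.

```python
def min_distance_to_critical_point(p0, x0, n_max, c=0.5):
    """Return [Delta_1, ..., Delta_n_max], where Delta_N is the minimum
    of |f^i(x0) - c| over 0 <= i <= N for f(x) = p0 * x * (1 - x)."""
    x = x0
    running_min = abs(x - c)
    deltas = []
    for _ in range(n_max):
        x = p0 * x * (1.0 - x)              # one iterate of the quadratic map
        running_min = min(running_min, abs(x - c))
        deltas.append(running_min)
    return deltas


deltas = min_distance_to_critical_point(p0=4.0, x0=0.3, n_max=10000)
for n in (10, 100, 1000, 10000):
    # If Delta_N decays like 1/N, the product N * Delta_N stays of order one.
    print(n, deltas[n - 1], n * deltas[n - 1])
```

For a single orbit the product $N \Delta_N$ fluctuates considerably, so this is only a rough consistency check on the scaling, not a verification of it.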

As in Conjecture 1, given $p_0 \in I_p$, an initial condition $x_0$, and the finite orbit $\{f_{p_0}^i(x_0)\}_{i=0}^{N}$
of $f_{p_0}$, we would like to determine if there is an orbit of $f_p$ that $\epsilon$-monotone-shadows this
orbit for some $p < p_0$. As before, the important statistic to know is $\Delta_N(x_0)$ (we will
henceforth assume that $x_0$ is fixed and refer simply to $\Delta_N$).

As in Conjecture 1, $f_p$ cannot $\epsilon$-shadow the orbit $\{f_{p_0}^i(c + \Delta_N)\}_{i=1}^{m}$ for $p < p_0$ if:

$$\sigma_i(c, p)\,[f_{p_0}^i(c + \Delta_N) - f_p^i(c)] > \epsilon \tag{3.23}$$

for any $i \le m$. This corresponds to what happens when an orbit, $\{f_{p_0}^i(c + \Delta_N)\}$,
of $f_{p_0}$ leads the orbit of the critical point of the map $f_p$ by more than $\epsilon$, so there is
no orbit of $f_p$ that can effectively shadow that orbit. The other way that $f_p$ can fail
to $\epsilon$-monotone-shadow the orbit $\{f_{p_0}^i(c + \Delta_N)\}$ is if $f_p^i(c)$ lags behind $f_{p_0}^i(c + \Delta_N)$
(by less than $\epsilon$), but $f_p^i(c)$ and $f_{p_0}^i(c + \Delta_N)$ are on different monotone branches (i.e., the
critical point, $c = \frac{1}{2}$, is between $f_p^i(c)$ and $f_{p_0}^i(c + \Delta_N)$).

Thus, to prove the conjecture it is sufficient to show that, given $\epsilon > 0$ sufficiently
small, there exist constants $K > 0$ and $C > 0$ such that for each $\Delta_N > 0$ there exists a
$p > p_0 - K\Delta_N^2$ and $i < -C \log \Delta_N^2$ such that one of the following is satisfied:

(1) $\sigma_i(c, p)\,[f_{p_0}^i(c + \Delta_N) - f_p^i(c)] > \epsilon$

(2) $\sigma_i(c, p)\,[f_{p_0}^i(c + \Delta_N) - f_p^i(c)] > 0$ and $\mathrm{sgn}\{c - f_{p_0}^i(c + \Delta_N)\} = -\mathrm{sgn}\{c - f_p^i(c)\}$.

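The problem is getting an estimate for $f_{p_0}^i(c + \Delta_N) - f_p^i(c)$. Recall that near $p = 4$ there
is a set $E \subset (2, 4]$ of positive Lebesgue measure such that for each $p_0 \in E$, $f_{p_0}$ satisfies
(CE1), (CP1), and favors higher parameters. Thus if $p_0 \in E$, there exist $K_0 > 0$ and
$N_0 > 0$ such that

$$\frac{1}{K_0} < \left| \frac{D_p f^i(c, p_0)}{D_x f^{i-1}(f(c, p_0), p_0)} \right| < K_0 \tag{3.24}$$

for all $i \ge N_0$. So, if $p_0 \in E$, for $p < p_0$ and each $i \ge N_0$ we have that

$$\begin{aligned}
\sigma_i(c, p)\,[f_{p_0}^i(c + \Delta_N) - f_p^i(c)]
&= \sigma_i(c, p)\,[(f_{p_0}^i(c) - f_p^i(c)) - (f_{p_0}^i(c) - f_{p_0}^i(c + \Delta_N))] \\
&\ge \sigma_i(c, p)\,[(D_p f^i(c, p_0)(p_0 - p) + O((p_0 - p)^2)) \\
&\qquad - (K' D_x f^{i-1}(f(c, p_0), p_0)\,\Delta_N^2 + O(\Delta_N^3))] \\
&\ge |D_x f^{i-1}(f(c, p_0), p_0)|\,[(p_0 - p) - K_0 K' \Delta_N^2 + O(\Delta_N^3) + O((p_0 - p)^2)]
\end{aligned} \tag{3.25}$$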

For each $i \ge N_0$, the left hand side of (3.23) tends to grow as $(p_0 - p) - K_0 K' \Delta_N^2$, at least
for small $\Delta_N^2$ and $p_0 - p$. Recall that $D_x f^{i-1}(f(c, p_0), p_0)$ tends to grow exponentially
with $i$ and $\Delta_N$ tends to decay proportional to $1/N$. Thus, given $\epsilon > 0$ one might expect
that there exist $K > 0$ and $C > 0$ such that either condition (1) or (2) is satisfied for
some $p > p_0 - K\Delta_N^2$ and $i < -C \log \Delta_N^2$.

This,  however,  is  a  somewhat  rough  calculation,  and  in  order  to  demonstrate  that 
either  conditions  (1)  or  (2)  are  satisfied,  we  need  to  bound  the  higher  order  terms  in 
(3.25). This involves getting a uniform estimate of the relationship between $D_p f^i(c, p_0 - \delta p)$
and $D_x f^{i-1}(f(c + \delta x, p_0), p_0)$ for small values of $\delta p$ and $\delta x$ as $i$ increases. This does
not appear to be a trivial task and is something that should be looked into more carefully in
the  future. 

3.6 Conclusions, remarks, and future work


The  primary  goal  of  this  chapter  was  to  examine  how  shadowing  works  in  one-dimensional 
maps in order to evaluate the feasibility of parameter estimation on simple chaotic systems.
We have been particularly interested in investigating how nonlinear folding affects
parameter shadowing and how this might help explain numerical results which show
asymmetrical behavior in the parameter space of one-dimensional maps. More specifically,
for a parameterized family of maps, $f_p$, it is apparently the case that an orbit for a
particular parameter value, $p = p_0$, is often shadowed much more readily by maps with
slightly higher parameter values than by maps with slightly lower parameter values (or
vice versa). This phenomenon has important effects on the possibilities for parameter
estimation. For example, if we are given noisy observations of the orbit described above
and asked what the parameter value was of the map that produced that data, then we
would immediately be able to eliminate most values less than $p_0$ as possible candidates
for the actual parameter value. On the other hand, it may be much more difficult to
distinguish $p_0$ from parameter values slightly larger than $p_0$.
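The simplest instance of this asymmetry can be written down directly for the quadratic map $f_p(x) = px(1 - x)$. The sketch below is an illustration under stated assumptions, not the general criterion of this chapter: it only checks the first iterate after a close approach to $c = 1/2$, and the function name and numerical values are hypothetical. A point of the $f_{p_0}$ orbit at distance $\Delta$ from $c$ maps to $p_0(1/4 - \Delta^2)$, while no point of $[0, 1]$ maps above $p/4$ under $f_p$.

```python
def excluded_in_one_step(p, p0, delta, eps):
    """Return True if no orbit of f_p can stay within eps of the image,
    under f_{p0}, of a point at distance delta from c = 1/2.

    Only the single iterate following the close approach is examined."""
    target = p0 * (0.25 - delta ** 2)   # f_{p0}(c + delta) = p0 * (1/4 - delta^2)
    highest_reach = p / 4.0             # maximum of f_p on [0, 1], attained at c
    return target - highest_reach > eps


p0, delta, eps = 3.9, 1e-3, 1e-3
print(excluded_in_one_step(3.89, p0, delta, eps))   # slightly lower parameter
print(excluded_in_one_step(3.91, p0, delta, eps))   # slightly higher parameter
```

A lower parameter value is ruled out as soon as the gap $p_0(1/4 - \Delta^2) - p/4$ exceeds $\epsilon$, whereas this one-step test never excludes a higher parameter value, which is the sense in which such an orbit is uninformative about values above $p_0$.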

For  piecewise  monotone  maps  with  positive  Lyapunov  exponents,  we  demonstrated 
that the folding behavior around a turning point generally leads to asymmetrical behavior,
unless the parameter dependence is degenerate in some way. In particular, images
of neighborhoods of a turning point under $f_p$ tend to separate exponentially fast for
perturbations in $p$. This results in a sort of lead-lag phenomenon as the images for different
parameter  values  separate,  causing  images  for  some  parameter  values  to  overlap  each 
other  more  than  others.  Near  the  turning  point,  orbits  for  parameter  values  that  lag 
behind  cannot  shadow  orbits  for  the  parameter  values  that  lead  unless  another  folding 
occurs  because  of  a  subsequent  approach  to  a  turning  point. 

For  the  case  of  unimodal  families  of  maps  with  negative  Schwarzian  derivative,  the 
result  is  sharper.  Apparently,  if  the  parameter  dependence  is  not  degenerate,  and  if 
a map, $f_{p_0}$, has positive Lyapunov exponents for some parameter value, $p_0$, then for
any $\epsilon > 0$ sufficiently small, there exists $C > 0$ so that for one direction in parameter
space (either $p > p_0$ or $p < p_0$), all orbits of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of
fp  if  Ip  —  Pol  <  Ce^.  Meanwhile,  in  the  other  direction  in  parameter  space,  there  exist 
constants $\delta > 0$ and $K > 0$ so that for any $\gamma > 1$ there is a positive Lebesgue measure of
parameter values such that if $|p - p_0| < \delta$ then almost no orbits of $f_{p_0}$ can be $\epsilon$-shadowed
by any orbit of $f_p$ if $|p - p_0| > (K\epsilon)^\gamma$. This clearly illustrates some sort of preference of
direction  in  parameter  space. 

One  might  also  note  that  this  result  demonstrates  that  all  orbits  of  certain  chaotic 
(nonperiodic)  systems  can  be  shadowed  by  orbits  of  systems  dominated  by  hyperbolic 
periodic attractors (consider, for example, the quadratic map, $f_p(x) = px(1 - x)$).
Shadowing results have sometimes been cited to justify the use of computers in analyzing
dynamical systems, since if one numerically iterates an orbit and finds that it is chaotic,
then similar real orbits must exist in that system (or nearby systems). This is true, but
one  should  also  be  careful,  because  the  real  orbits  that  shadow  a  numerically  generated 
trajectory  are  often  purely  pathological  (ie,  such  orbits  are  often  not  qualitatively  similar 
to  typical  orbits  of  the  system). 

In  any  case,  many  questions  related  to  this  material  still  remain  unanswered.  It 
seems  to  be  quite  difficult  to  come  up  with  crisp  general  results  when  it  comes  to  a 
general  topic  like  parameter  dependence  in  families  of  maps.  For  instance,  I  do  not 
know  of  a  simple  way  of  characterizing  exactly  when  parameter  shadowing  favors  one 
direction  over  the  other  in  parameter  space  for  piecewise  monotone  maps.  For  unimodal 
maps,  it  appears  that  perhaps  a  useful  connection  to  topological  entropy  may  be  made. 
If  topological  entropy  is  monotonic,  and  if  there  is  a  change  in  the  topological  entropy 
of the map $f_p$ with respect to $p$ at $p = p_0$, then certain asymmetrical shadowing results seem
likely for orbits of $f_{p_0}$. However, topological entropy does not appear to be an ideal
indicator  for  asymmetrical  shadowing,  since  it  is  global  in  nature.  On  the  other  hand, 
if  a  piecewise  monotone  map  has  multiple  turning  points,  it  is  possible  for  some  turning 
points  to  favor  higher  parameters  while  other  turning  points  favor  lower  parameters. 
Such  examples  are  interesting,  from  a  parameter  estimation  point  of  view,  because  that 
means  that  one  may  be  able  to  effectively  squeeze  parameter  estimates  within  a  narrow 
band  of  uncertainty  as  the  orbit  being  sampled  passes  close  to  turning  points  which  favor 
different  directions  in  parameter  space. 

Chapter 4


General  nonuniformly  hyperbolic 
systems 

In this chapter we examine shadowing behavior for general one-parameter families of
diffeomorphisms, $f_p : M \to M$ for $p \in \mathbb{R}$, where $M$ is a smooth compact manifold.
We want to consider why orbits shadow each other (or fail to shadow each other) in
maps that are nonuniformly hyperbolic. This is important to investigate so that we
can  properly  evaluate  the  feasibility  of  parameter  estimation  in  a  wide  class  of  chaotic 
systems. 

The  exposition  in  this  chapter  will  not  be  rigorous.  Most  of  the  arguments  will 
be  qualitative  in  nature.  Our  goal  here  is  to  motivate  some  possible  mechanisms  that 
might  help  explain  results  from  numerical  experiments.  In  particular  we  will  attempt 
to  draw  analogies  to  our  work  in  chapter  3  to  help  explain  what  may  be  happening  in 
multi-dimensional  systems. 


4.1  Preliminaries 

Let  us  first  outline  some  basic  concepts. 

We start by introducing the notion of Lyapunov exponents. Let $f : M \to M$ be a
diffeomorphism. Suppose that $M$ is a compact $q$-dimensional manifold and that for
some $x \in M$ there exist subspaces, $\mathbb{R}^q = E_x^{(1)} \supset E_x^{(2)} \supset \ldots$ in the tangent space at $x$
such that:

$$\lambda_i = \lim_{n \to \infty} \frac{1}{n} \log |Df^n(x)u| \quad \text{if } u \in E_x^{(i)} \setminus E_x^{(i+1)}$$

for some numbers $\lambda_1 > \lambda_2 > \ldots$. Then the $\lambda_i$'s are the Lyapunov exponents of the
orbit, $\{f^i(x)\}$. Oseledec's Multiplicative Ergodic Theorem ([48]) demonstrates that for
any $f$-invariant probability measure, $\mu$, these Lyapunov exponents exist for $\mu$-almost all
$x \in M$.
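In one dimension the definition above reduces to the orbit average of $\log |f'(x)|$, which is easy to estimate numerically. The following is a minimal sketch for the quadratic map $f_p(x) = px(1 - x)$; it is a one-dimensional simplification of the multidimensional setting of this chapter, and the function name, iteration counts, and initial condition are illustrative assumptions.

```python
import math


def lyapunov_exponent(p, x0, n_iter=20000, n_burn=100):
    """Estimate the Lyapunov exponent of f(x) = p * x * (1 - x) along the
    orbit of x0 as the average of log|f'(x_k)| over n_iter iterates."""
    x = x0
    for _ in range(n_burn):                 # discard an initial transient
        x = p * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        total += math.log(abs(p * (1.0 - 2.0 * x)))   # f'(x) = p * (1 - 2x)
        x = p * x * (1.0 - x)
    return total / n_iter


estimate = lyapunov_exponent(4.0, 0.3)
print(estimate, math.log(2.0))
```

For $p = 4$ the exponent is known to equal $\ln 2$ for almost every initial condition, which gives a convenient reference value for the estimate.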

If there are no $\lambda_i$'s equal to zero, then there exist local stable manifolds at $x$ tangent
to the linear subspace, $E_x^{(i)}$, if $\lambda_i < 0$. There also exists an analogous unstable manifold.
In other words, for almost any $x \in M$ there exists an $\epsilon > 0$ such that:

$$W_\epsilon^s(x, f) = \{y \in M : d(f^n(x), f^n(y)) < \epsilon \text{ for all } n \ge 0\}$$

$$W_\epsilon^u(x, f) = \{y \in M : d(f^{-n}(x), f^{-n}(y)) < \epsilon \text{ for all } n \ge 0\}$$

These  manifolds  are  locally  as  differentiable  as  /.  This  result  is  based  on  Pesin  [52]  and 
Ruelle  [54].  The  difference  between  these  manifolds  and  manifolds  for  the  uniformly 
hyperbolic  case  is  that  these  manifolds  do  not  have  to  exist  everywhere,  the  angles 
between the manifolds can approach zero, and the neighborhoods, $\epsilon$, can be arbitrarily
small for different $x \in M$.

We  can  also  define  global  stable  and  unstable  manifolds  as  follows: 

$$W^s(x, f) = \{y \in M : d(f^n(x), f^n(y)) \to 0 \text{ as } n \to \infty\}$$

$$W^u(x, f) = \{y \in M : d(f^{-n}(x), f^{-n}(y)) \to 0 \text{ as } n \to \infty\}.$$

Note that these manifolds are invariant in the sense that $f(W^s(x, f)) = W^s(f(x), f)$.
Although locally differentiable, the manifolds can have extremely complicated structure
in general.


4.2  Discussion 


We  now  return  to  the  investigation  of  shadowing  orbits. 

There  have  been  some  attempts  to  examine  the  linear  theory  regarding  nonuniformly 
hyperbolic maps in order to make statements about shadowing behavior (see for example
[24]). However, since the nonexistence of shadowing orbits fundamentally results
from degeneracy in the linear theory, it is also useful to consider what happens in
terms  of  the  structure  of  nearby  manifolds. 

For almost every $x$, $f$ looks locally hyperbolic. However, in nonhyperbolic systems
if we iterate the orbit $\{f^i(x)\}$, we will eventually approach some sort of degeneracy.

For example, one possible scenario is that for some point $a \in \{f^i(x)\}$, $W^s(a, f)$
and $W^u(a, f)$ are nearly tangent and intersect each other at some nearby point, $y$. As
illustrated in figure 4.1, this structure implies a certain scenario for the evolution of
the manifolds as we map forward with $f$ or backward with $f^{-1}$. We will argue that this
situation is in some sense a multidimensional analog for the folding behavior we observed
in one dimension.

Figure 4.1: Possible situation near a homoclinic tangency. Note how a fold in the unstable
manifold is created as we map ahead by $f^n$, and a fold in the stable manifold is created as we
map back by $f^{-n}$.

Figure 4.2: An illustrative example of how homoclinic tangencies can cause problems for
shadowing.


For  one  thing,  the  homoclinic  intersection  of  manifolds  can  prevent  or  at  least  hamper 
shadowing. We illustrate this in figure 4.2. Consider for example two nearby points $a$
and $b$ such that $d(a, b) < \delta$ and let $\{c_n\}$ be a pseudo-orbit of $f$ with the following
form:

$$c_n = \begin{cases} f^n(a) & \text{if } n < 0 \\ f^n(b) & \text{if } n \ge 0 \end{cases}$$

In a uniformly hyperbolic scenario as shown in figure 4.2(a), we can easily pick a suitable
orbit to shadow $\{c_n\}$, namely $\{f^n(z)\}$ where $z = W_\epsilon^u(a, f) \cap W_\epsilon^s(b, f)$. However if a
homoclinic intersection is nearby as in figure 4.2(b), we see that there is no obvious way to
pick a shadowing orbit, since there may be no point $z$ satisfying $z = W_\epsilon^u(a, f) \cap W_\epsilon^s(b, f)$.
Note that the difficulty in finding a shadowing orbit seems to depend on how close $a$ is
to the homoclinic tangency, and the geometry of the manifolds nearby.
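The uniformly hyperbolic case can be made concrete with a linear example, where the stable and unstable manifolds are straight lines and the intersection point $z$ can be computed exactly. The sketch below is an illustration only: it assumes the linear map $f(x) = Ax$ on the plane with $A = [[2, 1], [1, 1]]$ (not a map on a compact manifold, and not one studied in this report), and the points $a$ and $b$ are arbitrary.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])         # hyperbolic linear map f(x) = A x
A_inv = np.array([[1.0, -1.0], [-1.0, 2.0]])   # exact inverse, since det A = 1

# A is symmetric, so eigh returns ascending eigenvalues: stable first, unstable second.
eigvals, eigvecs = np.linalg.eigh(A)
e_s, e_u = eigvecs[:, 0], eigvecs[:, 1]


def iterate(point, n):
    """Return f^n(point); negative n uses the inverse map."""
    M = np.linalg.matrix_power(A if n >= 0 else A_inv, abs(n))
    return M @ point


a = np.array([0.30, 0.20])
b = np.array([0.31, 0.19])

# z lies on the unstable line through a and the stable line through b:
# z = a + alpha * e_u = b + beta * e_s.
alpha, beta = np.linalg.solve(np.column_stack([e_u, -e_s]), b - a)
z = a + alpha * e_u


def pseudo_orbit(n):
    """c_n = f^n(a) for n < 0 and f^n(b) for n >= 0."""
    return iterate(a, n) if n < 0 else iterate(b, n)


errors = [np.linalg.norm(iterate(z, n) - pseudo_orbit(n)) for n in range(-8, 9)]
print(max(errors), np.linalg.norm(b - a))
```

Because the distance from $f^n(z)$ to $f^n(b)$ contracts along the stable direction for $n \ge 0$, and the distance to $f^n(a)$ contracts along the unstable direction for $n < 0$, the shadowing error never exceeds $d(a, b)$ here. Near a homoclinic tangency no such intersection point need exist, which is the difficulty described above.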

Homoclinic  tangencies  could  also  cause  asymmetrical  shadowing  in  parameter  space. 
Numerical experiments with maps that favor higher parameters seem to show the following
scenario: As we map a state space region near a homoclinic tangency ahead by $f_{p_0}$
repeatedly, a tongue, or fold, of the unstable manifold develops as the manifold expands.
If we examine the corresponding situation in a map with a slightly higher parameter
value, we find that the corresponding fold in the unstable manifold for the higher
parameter system overlaps the fold in the unstable manifold of the original system. In this
case we expect that the original system would have difficulty shadowing a trajectory
close to the apex of the fold in the higher parameter system. This situation is depicted
in figure 4.3. A similar argument works for $f^{-1}$. Numerical results seem to indicate that
for many families of systems at least, there is an ordering in parameter space such that
as we increase (or decrease) the parameter value, the systems get progressively more
"flexible" in the sense that systems that are more flexible can always shadow orbits of
systems that are less flexible. Numerical evidence for this type of shadowing behavior can
be found in chapter 6.

Figure 4.3: Why higher dimensional maps might exhibit asymmetrical shadowing in parameter
space.

Figure 4.4: Refolding after a subsequent encounter with a homoclinic tangency.

Also  recall  that  with  maps  of  the  interval,  a  folded  region  can  get  refolded  upon  a 
subsequent  encounter  with  a  turning  point.  A  similar  thing  can  also  happen  in  higher 
dimensions.  Consider  figure  4.4  for  example.  Here  we  see  that  the  folded  tongue  of  the 
unstable  manifold  gets  refolded  back  on  itself,  possibly  allowing  lagging  orbits  to  catch 
up  so  that  shadowing  is  possible.  This  suggests  that  there  may  be  interesting  shadowing 
results  of  the  sort  described  in  chapter  3  for  one  dimension.  The  situation  here,  however, 
is more complicated since in one dimension there were only a finite number of sources
of folding, namely the turning points, while here there are likely to be an infinite number
of sources for the folding.


Chapter 5


Parameter  estimation  algorithms 


5.1  Introduction 

In  this  chapter  we  present  new  algorithms  for  estimating  the  parameters  of  chaotic 
systems.  In  particular  we  will  be  interested  in  investigating  estimation  algorithms  for 
nonuniformly  hyperbolic  dynamical  systems,  because  these  systems  include  most  of  the 
chaotic  systems  likely  to  be  encountered  in  physical  applications.  From  our  discussion 
in  chapters  3  and  4,  we  know  that  there  are  three  basic  effects  that  are  important  to 
consider  when  designing  a  parameter  estimation  algorithm  for  nonuniformly  hyperbolic 
dynamical  systems:  (1)  most  data  points  contribute  very  little  to  our  knowledge  of  the 
parameters  of  the  system,  while  a  relatively  few  data  points  may  be  extremely  sensitive  to 
parameters,  (2)  the  sensitive  sections  of  orbits  reflect  nearby  folding  behavior  which  must 
be  accurately  modeled  in  order  to  extract  information  about  the  parameters,  and  (3) 
the  folding  behavior  often  results  in  asymmetrical  shadowing  behavior  in  the  parameter 
space  of  the  system,  so  we  can  generally  eliminate  only  parameters  slightly  less  than 
or  slightly  greater  than  the  actual  parameter  value.  The  goal  is  to  develop  an  efficient 
algorithm  that  takes  all  three  of  these  effects  into  account. 

Our  basic  strategy  will  be  to  take  advantage  of  property  (1)  above  by  using  a  linear 
filtering  technique  to  scan  through  most  of  the  data  and  attempt  to  locate  parts  of  the 
trajectory  where  folding  occurs.  In  sections  of  the  trajectory  where  folding  does  occur, 
we  will  examine  the  data  closely  using  a  type  of  Monte-Carlo  analysis  which  we  have 
designed  to  circumvent  the  numerical  pitfalls  that  accompany  work  with  chaotic  systems. 

We begin this chapter by surveying some traditional filtering techniques and examining
some basic approaches for parameter estimation problems (section 5.3). Those
readers who are familiar with traditional estimation theory may wish to skim these
sections. We go on in section 5.4 to examine how and why traditional algorithms fail
in high-precision estimation of chaotic systems. We then propose a new algorithm for
estimating the parameters of a chaotic system in one dimension (section 5.5). This
algorithm is generalized in section 5.6 to deal with systems in higher dimensions.

Numerical results describing the performance of these algorithms are presented in
chapter 6.


5.2  The  estimation  problem 

Let us begin by restating the problem.¹ Let:

$$x_{n+1} = f_p(x_n) \tag{5.1}$$

and

$$y_n = x_n + v_n \tag{5.2}$$

where $x_n$ is the state of the system, $y_n$ are observations, $v_n$ represents noise, $f_p$ evolves
the state, $p \in I_p \subset \mathbb{R}$ is the scalar parameter we are trying to estimate, and $I_p$ is a closed
interval of the real line.

It will also be useful to write the system in (5.1) and (5.2) in terms of $u_n = (x_n, p)$,
a combined vector of state and parameters:

$$u_{n+1} = g(u_n)$$

$$y_n = H_n u_n + v_n$$

where the map, $g$, satisfies $g(x, p) = (f_p(x), p)$, and

$$H_n = \begin{bmatrix} I_q & 0 \end{bmatrix}$$

where $I_q$ is a $q \times q$ identity matrix if the state, $x$, has dimension $q$.
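The setup above can be simulated directly. The following is a minimal sketch, assuming a scalar state ($q = 1$), the quadratic map $f_p(x) = px(1 - x)$ as the dynamics, and Gaussian observation noise; the function names, noise level, and parameter values are illustrative choices, not part of the original formulation.

```python
import numpy as np


def f(x, p):
    """State evolution f_p; the quadratic map is used as an example."""
    return p * x * (1.0 - x)


def g(u):
    """Augmented map g(x, p) = (f_p(x), p): the parameter is carried unchanged."""
    x, p = u
    return np.array([f(x, p), p])


def simulate(x0, p, n, noise_std, seed=0):
    """Return augmented states u_0..u_n and observations y_k = H u_k + v_k."""
    rng = np.random.default_rng(seed)
    H = np.array([[1.0, 0.0]])          # H_n = [I_q 0] with q = 1
    u = np.array([x0, p])
    states, observations = [], []
    for _ in range(n + 1):
        v = noise_std * rng.standard_normal(1)   # zero-mean observation noise
        states.append(u.copy())
        observations.append(H @ u + v)
        u = g(u)
    return np.array(states), np.array(observations)


U, Y = simulate(x0=0.3, p=3.9, n=50, noise_std=1e-3)
print(U.shape, Y.shape)
```

Writing the parameter as an extra state component with trivial dynamics is what allows state estimators to be reused for parameter estimation in the sections that follow.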

We now make a few remarks about notation. In general, throughout this chapter,
the letters $x$, $p$, $u$ will correspond to state, parameter, and state-parameter vectors. Set
$x^n = (x_0, x_1, \ldots, x_n)$, $y^n = (y_0, y_1, \ldots, y_n)$, and $u^n = (u_0, u_1, \ldots, u_n)$.

The symbol "^" above a vector will be used to denote an estimate. For example,
the estimate of the parameter $p$ based on the observations in $y^n$ will be denoted $\hat{p}_n$.
We will also use the notation $\hat{u}_{n|k}$ to denote an estimate of $u_n$ based on observations
$y^k$. Similarly, the symbol "~" will be used to denote an error quantity. For example we
might write that $\tilde{u}_n = u_n - \hat{u}_{n|n}$.

¹Note that the setup in (5.1) and (5.2) is somewhat less general than standard formulations of
filtering problems. For example one could add an extra term, $w_n$, to represent the system noise so that
$x_{n+1} = f_p(x_n) + w_n$, or one could add an extra function, $h_n(x)$, so that $y_n = h_n(x_n) + v_n$, to reflect
the fact that the observations might represent a more general function of the state. However we have
elected to keep the problem as simple as possible in order to concentrate on how chaos affects estimation,
and to be consistent with the presentation in chapters 2-4.

5.3 Traditional approaches


We now examine some basic methods for approaching parameter estimation. In
sections 5.3.1 and 5.3.2 we mainly concentrate on providing the motivation behind linear
techniques like the Kalman filter. This treatment is extended in section 5.3.3, where
nonlinear techniques are discussed in more detail. The material in this section is
well-known in the engineering community, but we explain it here because it provides the
basis  for  new  algorithms  we  develop  later  to  deal  with  chaotic  systems. 

There  are  a  variety  of  ways  to  approach  parameter  estimation  problems.  Engineers 
have  developed  a  whole  host  of  ad  hoc  tricks  that  may  be  applied  in  different  situations. 
The basic idea, however, is relatively simple. Given observations, $\{y_k\}_{k=0}^{n}$, and a model
for $f_p$, we would like to pick our parameter estimate, $p = \hat{p}_n$, so that there exists an
orbit, $\{x_k(p)\}_{k=0}^{n}$, of $f_p$ that makes the residuals,

$$\epsilon_k(p) = y_k - x_k(p)$$

as small as possible for $k \in \{0, 1, \ldots, n\}$. In order to choose the best possible estimate,
$\hat{p}_n$, we need some criteria for evaluating how small these residuals are.

From  here,  there  are  a  number  of  different  ways  to  approach  the  problem  of  how 
to  choose  the  optimizing  criteria  to  make  use  of  all  the  known  information.  In  fact, 
the  recursive  Kalman  filter  itself  has  many  different  possible  interpretations.  Many  of 
the  different  approaches  to  parameter  estimation  provide  interesting  insight  into  the 
estimation  problem  itself.  Our  objective  here  will  be  to  motivate  some  of  the  different 
ideas  on  how  to  look  at  parameter  estimation,  without  getting  immersed  in  specific 
derivations.  The  reader  may  consult  [3],  [29],  or  [23]  for  more  detailed  and/or  formal 
treatments  of  this  subject. 


5.3.1  Nonrecursive  estimation 


Least  squares  estimation 

One of the simplest ideas about how to estimate parameters is to choose the estimate
$\hat{p}_n$ so that $p = \hat{p}_n$ minimizes the quantity:

$$S_n(p) = \inf_{\{x_{i|n}(p)\}_{i=0}^{n} \in Z(p)} \sum_{i=0}^{n} \left[ (y_i - x_{i|n}(p))^T R_i^{-1} (y_i - x_{i|n}(p)) \right] \tag{5.6}$$

where $Z(p)$ is the set of all orbits of $f_p$ and the $R_i^{-1}$ are symmetric positive-definite
matrices that weight the relative importance of various measurements. This sort of
idea, known as least squares estimation, dates back to Gauss [22].

The formulation in (5.6) is not really useful for estimating parameters in practice,
since there is no direct way of choosing $\hat{p}_n$ to minimize (5.6). Things become more
concrete, however, if we assume the function $g$ in (5.4) is linear in both state and
parameters.² In this case we can write:

$$y^n = G_n u_0 + v^n \tag{5.7}$$

where $G_n$ is a constant matrix that effectively represents the dynamics of the system.
Our goal is to get a good estimate for $u_0 = (x_0, p)$ based on the observations in $y^n$. In
this case, least squares estimation amounts to minimizing

$$S_n(u_0) = (y^n - G_n u_0)^T R_n^{-1} (y^n - G_n u_0) \tag{5.8}$$

with respect to $u_0$, where $R_n^{-1}$ is a positive-definite weighting matrix. Our estimate for
$u_0$ based on $y^n$, $\hat{u}_{0|n} = (\hat{x}_{0|n}, \hat{p}_n)$, is the value of $u_0$ that minimizes $S_n(u_0)$. We can find
the appropriate minimum of $S_n(u_0)$ by taking the derivative of $S_n$ with respect to $u_0$. If
we do this we find that the value of $u_0$ that minimizes $S_n(u_0)$ is:

$$\hat{u}_{0|n} = (G_n^T R_n^{-1} G_n)^{-1} G_n^T R_n^{-1} y^n \tag{5.9}$$

where $G_n^T$ denotes the transpose of $G_n$.
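The closed-form estimate can be illustrated on a small linear example. The sketch below assumes the scalar system $x_{k+1} = a x_k + p$ with known $a$, which is linear in $u_0 = (x_0, p)$; this system, the function names, and the numerical values are hypothetical choices made only to produce a concrete $G_n$.

```python
import numpy as np


def build_G(a, n):
    """Rows map u0 = (x0, p) to x_k for x_{k+1} = a * x_k + p, using
    x_k = a^k * x0 + p * (a^k - 1) / (a - 1)."""
    k = np.arange(n + 1)
    return np.column_stack([a ** k, (a ** k - 1.0) / (a - 1.0)])


def least_squares_estimate(G, R, y):
    """Weighted least squares: (G^T R^-1 G)^-1 G^T R^-1 y."""
    R_inv = np.linalg.inv(R)
    return np.linalg.solve(G.T @ R_inv @ G, G.T @ R_inv @ y)


a, n = 0.9, 20
G = build_G(a, n)
u0_true = np.array([0.5, 0.2])
rng = np.random.default_rng(1)
y = G @ u0_true + 0.01 * rng.standard_normal(n + 1)   # noisy observations
R = 0.01 ** 2 * np.eye(n + 1)                         # weighting matrix

print(least_squares_estimate(G, R, y))
```

With noise-free observations the estimate reproduces $u_0$ exactly, and with $R$ proportional to the identity it coincides with the ordinary least squares solution.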

Stochastic  framework 

Another way to approach the problem is to think of $u_n$ and $v_n$ as random
variables. We shall assume that the $v_n$'s are independent random variables with zero
mean. The idea is to choose a parameter estimate, $\hat{p}_n$, based on $y^n$ so that the residuals,
$\epsilon_i(p) = y_i - x_i(p)$, are as close to zero as possible in some statistical sense for $i \in
\{0, 1, \ldots, n\}$.

We can write the probability density function³ for $u_n$ given $y^k$ according to Bayes'
rule:

$$P(u_n | y^k) = \frac{P(y^k | u_n) P(u_n)}{P(y^k)} \tag{5.10}$$


These density functions describe everything we might know about the states and
parameters of the system. Later we will examine more closely how tracking such probability
densities in full can provide information about how to choose parameter estimates,
especially in cases involving nonlinear or chaotic systems. To start with, however, we
concentrate  on  examining  conventional  filters  which  look  only  at  first  and  second  order 
moments  of  these  densities. 
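Bayes' rule can be applied numerically by discretizing the unknown. The following is a minimal sketch for a single scalar unknown observed once with additive Gaussian noise; the grid, the flat prior, and the noise level are illustrative assumptions, and the normalization by the sum plays the role of the denominator $P(y^k)$.

```python
import numpy as np


def bayes_update(grid, prior, y, noise_std):
    """Posterior on a grid for a scalar u observed as y = u + v,
    with v Gaussian, zero mean, standard deviation noise_std."""
    likelihood = np.exp(-0.5 * ((y - grid) / noise_std) ** 2)   # P(y | u), up to a constant
    unnormalized = likelihood * prior                            # P(y | u) P(u)
    return unnormalized / unnormalized.sum()                     # divide by P(y)


grid = np.linspace(0.0, 1.0, 1001)
prior = np.full(grid.size, 1.0 / grid.size)    # flat prior over the grid
posterior = bayes_update(grid, prior, y=0.42, noise_std=0.05)

mean = np.sum(grid * posterior)                          # first moment
variance = np.sum((grid - mean) ** 2 * posterior)        # second central moment
print(grid[np.argmax(posterior)], mean, variance)
```

The mean and variance computed at the end are the first and second order moments that conventional filters track in place of the full density.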

²Note that this assumption is extremely restrictive in practice, since even if the system is linear
with respect to state, it is generally nonlinear with respect to combined states and parameters. The
purpose of this example, however, is to simply motivate linear ideas. We address nonlinearity in the
next section.

³Contrary to common convention, our choice of the letter $p$ for the parameter necessitates using a
capital $P$ to denote probability density functions. Thus $P(u_n | y^k)$ represents the density for $u_n$ given
the value of $y^k$.

Minimum variance

Given the density function, $P(u_0 | y^n)$, one approach is to pick the estimate, $\hat{u}_{0|n}$, to
minimize the variance,

$$E[(u_0 - \hat{u}_{0|n})^T (u_0 - \hat{u}_{0|n})] \tag{5.11}$$

where $E[x] = \int x P(x)\,dx$ denotes the expected value of $x$. This criterion is called the
minimum variance condition. It turns out that this estimator has particularly nice
properties. For instance, it is not hard to show (e.g., [57]) that the $\hat{u}_{0|n}$ that minimizes
(5.11) also satisfies:

$$\hat{u}_{0|n} = E[u_0 | y^n]$$

for any density function, $P(u_0 | y^n)$.

Now suppose that $g$ is linear in state and parameters so that (5.7) is satisfied. Let
us attempt to find the so-called optimal linear estimator,

$$\hat{u}_{0|n} = A_n y^n + b_n$$

where the constant matrix, $A_n$, and constant vector, $b_n$, are chosen to minimize the
variance condition in (5.11). Assuming that the estimator is unbiased (i.e., $E(u_0 -
\hat{u}_{0|n}) = 0$) then:

$$b_n = E(u_0) - A_n E(y^n).$$

Minimizing $E[(u_0 - \hat{u}_{0|n})^T (u_0 - \hat{u}_{0|n})]$ we find ([57]) that

$$A_n = (Q^{-1} + G_n^T R_n^{-1} G_n)^{-1} G_n^T R_n^{-1} \tag{5.12}$$

where $Q = E[u_0 u_0^T]$ is the covariance matrix of $u_0$ and $R_n = E[v^n (v^n)^T]$ is the covariance
matrix of $v^n$. Thus we have:

$$\hat{u}_{0|n} = E(u_0) + A_n (y^n - E[y^n]) \tag{5.13}$$

where $A_n$ is as given in (5.12). Comparing this result with (5.9) we see that the $\hat{u}_{0|n}$
above, which we derived as the linear estimator with minimum variance, actually looks
a lot like the estimator from the deterministic least squares approach except for the
addition of a priori information about $u_0$ (in the form of $E(u_0)$ and the covariance
$Q$). With the minimum variance approach, the weighting factor $R_n$ also has a definite
interpretation as the covariance of the measurement noise.

Furthermore, if we assume that $u_n$ and $v_n$ are Gaussian random variables,⁴ and
attempt to optimize the estimator $\hat{u}_{0|n}$ for minimum variance, we again find (see [30])
that $\hat{u}_{0|n}$ has the form given in (5.12) and (5.13).
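The relationship between the minimum variance estimator and least squares can be checked numerically. The sketch below is written under stated assumptions: $Q$ is treated as the prior covariance of $u_0$ about its mean, the noise has zero mean so that $E[y^n] = G_n E(u_0)$, and the matrix $G$, the function name, and all numerical values are randomly generated or hypothetical.

```python
import numpy as np


def minimum_variance_estimate(G, R, Q, u0_mean, y):
    """Linear minimum variance estimate following (5.12) and (5.13)."""
    R_inv = np.linalg.inv(R)
    # A = (Q^-1 + G^T R^-1 G)^-1 G^T R^-1, as in (5.12)
    A = np.linalg.solve(np.linalg.inv(Q) + G.T @ R_inv @ G, G.T @ R_inv)
    # (5.13), using E[y] = G E(u0) for zero-mean noise
    return u0_mean + A @ (y - G @ u0_mean)


rng = np.random.default_rng(2)
G = rng.standard_normal((30, 2))              # illustrative dynamics matrix
R = 0.1 ** 2 * np.eye(30)                     # measurement noise covariance
u0_true = np.array([1.0, -0.5])
y = G @ u0_true + 0.1 * rng.standard_normal(30)

R_inv = np.linalg.inv(R)
least_squares = np.linalg.solve(G.T @ R_inv @ G, G.T @ R_inv @ y)          # as in (5.9)
diffuse = minimum_variance_estimate(G, R, 1e8 * np.eye(2), np.zeros(2), y)
tight = minimum_variance_estimate(G, R, 1e-8 * np.eye(2), np.zeros(2), y)
print(least_squares, diffuse, tight)
```

With a very broad prior the estimate approaches the least squares solution of (5.9), while a very narrow prior pins the estimate to the prior mean. This is the sense in which (5.12) and (5.13) add a priori information to the deterministic approach.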

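⁴A random variable $v \in \mathbb{R}^q$ has Gaussian distribution if

$$P(v) = \frac{1}{(2\pi)^{q/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (v - E[v])^T \Sigma^{-1} (v - E[v]) \right\}$$

where $E[v]$ is the expected value of $v$ and $\Sigma = E[vv^T]$ is the covariance matrix of $v$.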

Thus, in summary, we see that the optimal estimator, $\hat{u}_{0|n}$, as given in (5.12) and (5.13) has a number of different interpretations. If the system, $g$, is linear then the estimator can be thought of as resulting from a deterministic least squares approach. If $u_0$ and $v^n$ are thought of as random variables, then $\hat{u}_{0|n} = E[u_0 \mid y^n]$, and if we assume that $u_0$ and $v^n$ are Gaussian then the $\hat{u}_{0|n}$ given in (5.13) satisfies the minimum variance condition. Alternatively, if we drop the Gaussian assumption and search for the best linear estimator that minimizes the variance condition, we find that $\hat{u}_{0|n}$ as given in (5.12) and (5.13) is the optimal linear estimator. All these interpretations motivate us to use the estimator given in (5.12) and (5.13).
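
The batch form of (5.12) and (5.13) can be sketched numerically as follows. This is only an illustration under stated assumptions: the observation model is taken to be $y^n = G u_0 + v^n$ (so that $E[y^n] = G\,E(u_0)$), and the function and argument names are hypothetical, not taken from the report.

```python
import numpy as np

def linear_min_variance_estimate(y, G, Q, R, u_mean):
    """Batch linear estimator of u_0 in the form of (5.12)-(5.13).

    Assumes y = G u_0 + v, with prior covariance Q for u_0 and
    noise covariance R for v (illustrative model, see lead-in).
    """
    R_inv = np.linalg.inv(R)
    # Gain matrix A_n from (5.12)
    A = np.linalg.inv(np.linalg.inv(Q) + G.T @ R_inv @ G) @ G.T @ R_inv
    # Estimate from (5.13), using E[y] = G E[u_0]
    return u_mean + A @ (y - G @ u_mean)
```

Note that every new observation enlarges `y`, `G` and `R`, so the whole computation has to be repeated; this is the cost that a recursive formulation avoids.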


5.3.2  The  Kalman  filter 

We now have the form of an optimal filter for linear systems. However, the filter is computationally awkward in this form, since each new measurement requires recomputing the whole estimate. It would be preferable to take new data into account without having to recompute everything. This is accomplished with the recursive Kalman filter.

The Kalman filter is mathematically equivalent to the linear estimator described in (5.12) and (5.13), except that it has some important computational advantages. The basic premise of the Kalman filter is that the state of the filter can be kept with two statistics, $\hat{u}_{n|n}$ and $\Sigma_{n|n}$, where $\Sigma_{n|n}$ is the covariance matrix, $E[(u_n - \hat{u}_{n|n})(u_n - \hat{u}_{n|n})^T]$. Once we have these two statistics, it is possible, for example, to determine the next state of the filter, $\hat{u}_{n+1|n+1}$ and $\Sigma_{n+1|n+1}$, directly given a new piece of data, $y_{n+1}$, the filter's present state, $\hat{u}_{n|n}$, $\Sigma_{n|n}$, and knowledge of the map $g$.

Specifically, suppose we are given the linear system:

$$u_{n+1} = \Phi_n u_n$$

$$y_n = H_n u_n + v_n$$

where $v_n$ are independent random variables with zero mean and covariance $R_n$. The recursive Kalman filter can be written in two parts:

Prediction: 


$$\hat{u}_{n+1|n} = \Phi_n \hat{u}_{n|n} \tag{5.14}$$

+  fl,+i  (5.15) 

Combination: 

$$\hat{u}_{n+1|n+1} = \hat{u}_{n+1|n} + K_{n+1}(y_{n+1} - H_{n+1}\hat{u}_{n+1|n}) \tag{5.16}$$

$$\Sigma_{n+1|n+1} = (I - K_{n+1}H_{n+1})\Sigma_{n+1|n} \tag{5.17}$$

where the Kalman gain, $K_{n+1}$, is given by:

$$K_{n+1} = \Sigma_{n+1|n}H_{n+1}^T[H_{n+1}\Sigma_{n+1|n}H_{n+1}^T + R_{n+1}]^{-1} \tag{5.18}$$
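
A minimal sketch of one prediction and combination cycle, written with NumPy, may make the recursion concrete. The function names are hypothetical, and the optional `extra_cov` argument is an assumed hook for any additive covariance term in the prediction step; with the default it propagates the covariance through the linear dynamics only.

```python
import numpy as np

def kalman_predict(u_est, cov, Phi, extra_cov=None):
    """Prediction step: map the estimate and covariance through Phi."""
    u_pred = Phi @ u_est
    cov_pred = Phi @ cov @ Phi.T
    if extra_cov is not None:
        # Hypothetical additive term; not used by default.
        cov_pred = cov_pred + extra_cov
    return u_pred, cov_pred

def kalman_combine(u_pred, cov_pred, y_new, H, R):
    """Combination step, following (5.16)-(5.18)."""
    # Kalman gain (5.18)
    K = cov_pred @ H.T @ np.linalg.inv(H @ cov_pred @ H.T + R)
    # Updated estimate (5.16) and covariance (5.17)
    u_new = u_pred + K @ (y_new - H @ u_pred)
    cov_new = (np.eye(len(u_pred)) - K @ H) @ cov_pred
    return u_new, cov_new, K
```

Only the current estimate and covariance are carried between calls, which is what allows new data to be absorbed without reprocessing old measurements.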


Motivation  and  derivation 

The Kalman filter can be motivated in the following way.⁵ Consider the metric space, $X$, of random variables where inner products and norms are defined by:

$$\langle x, y \rangle = E[xy^T] \quad \text{and} \quad \|x\|^2 = \langle x, x \rangle$$

if $x, y \in X$. Let $Y_n = \mathrm{span}\{y_0, y_1, \ldots, y_n\}$ be the space of all linear combinations of $\{y_0, y_1, \ldots, y_n\}$. To satisfy the minimum variance condition, we would like to pick $\hat{u}_{n|n} \in Y_n$ to minimize:

$$E[\tilde{u}_n^T \tilde{u}_n] = \|\tilde{u}_n\|^2,$$

where $\tilde{u}_n = u_n - \hat{u}_{n|n}$. This formulation gives a definite geometric interpretation for the minimization problem and helps to show intuitively what the appropriate $\hat{u}_{n|n}$ is. In order to minimize the distance between $u_n$ and $\hat{u}_{n|n} \in Y_n$, it makes sense to pick $\hat{u}_{n|n}$ so that $\tilde{u}_n$ is orthogonal to $Y_n$. That is, we require:

$$\langle \tilde{u}_n, y \rangle = 0 \tag{5.19}$$

for any $y \in Y_n$. It is not hard to show that this condition is in fact sufficient to minimize $E[\tilde{u}_n^T \tilde{u}_n]$ (see e.g., [3]). From a statistical standpoint, this result also makes sense since it says that the error of the estimate, $\tilde{u}_n$, should be uncorrelated with the measurements. In some sense, the estimate uses all the information contained in the measurements.

We can now derive the equations of the Kalman filter. The prediction equations are relatively straightforward:

$$\hat{u}_{n+1|n} = E[u_{n+1} \mid y^n] = \Phi_n \hat{u}_{n|n}$$

^n+l(n  .^[(^n+l|n  ^n+l|n)(^n+l|n  ^n+l|n)  ]  —  ^nEn\n^n  ^n+1  • 


For the estimator $\hat{u}_{n+1|n+1}$ to be unbiased, $\hat{u}_{n+1|n+1}$ must have the form given in (5.16). Let us now verify that the formula for $K_{n+1}$ in (5.18) makes the Kalman filter an optimal linear estimator. To do this, we must show that $K_{n+1}$ minimizes the variance, $E[\tilde{u}_{n+1}^T \tilde{u}_{n+1}]$, where $\tilde{u}_{n+1} = u_{n+1} - \hat{u}_{n+1|n+1}$. Since $\hat{u}_{n+1|n+1} \in Y_{n+1}$, we know from (5.19) that a sufficient condition for $E[\tilde{u}_{n+1}^T \tilde{u}_{n+1}]$ to be minimized is that:

$$E[\tilde{u}_{n+1} \hat{u}_{n+1|n+1}^T] = \langle \tilde{u}_{n+1}, \hat{u}_{n+1|n+1} \rangle = 0. \tag{5.20}$$

⁵Much of the explanation here follows the exposition in Siapas [56].
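
Let us investigate the consequences of this condition. First we have:

$$\tilde{u}_{n+1} = \Phi_n u_n - [\hat{u}_{n+1|n} + K_{n+1}(y_{n+1} - H_{n+1}\hat{u}_{n+1|n})]$$

$$= \Phi_n u_n - \Phi_n \hat{u}_{n|n} - K_{n+1}[H_{n+1}u_{n+1} + v_{n+1}] + K_{n+1}H_{n+1}\hat{u}_{n+1|n}$$

$$= (I - K_{n+1}H_{n+1})\Phi_n \tilde{u}_n - K_{n+1}v_{n+1}$$

So,

$$E[\tilde{u}_{n+1}\hat{u}_{n+1|n+1}^T] = E[\{(I - K_{n+1}H_{n+1})\Phi_n \tilde{u}_n - K_{n+1}v_{n+1}\}\{\hat{u}_{n+1|n} + K_{n+1}(y_{n+1} - H_{n+1}\hat{u}_{n+1|n})\}^T]$$

$$= E[\{(I - K_{n+1}H_{n+1})\Phi_n \tilde{u}_n - K_{n+1}v_{n+1}\}\{\Phi_n \hat{u}_{n|n} + K_{n+1}H_{n+1}\Phi_n \tilde{u}_n + K_{n+1}v_{n+1}\}^T] \tag{5.21}$$

Since we require that $E[\tilde{u}_n^T \hat{u}_{n|n}] = \mathrm{Trace}\{E[\tilde{u}_n \hat{u}_{n|n}^T]\} = 0$, from (5.21) we get that:

$$\mathrm{Trace}\{E[\tilde{u}_{n+1}\hat{u}_{n+1|n+1}^T]\}$$

$$= \mathrm{Trace}\{(I - K_{n+1}H_{n+1})\Phi_n E[\tilde{u}_n \tilde{u}_n^T]\Phi_n^T H_{n+1}^T K_{n+1}^T - K_{n+1}E[v_{n+1}v_{n+1}^T]K_{n+1}^T\}$$

$$= \mathrm{Trace}\{\Phi_n \Sigma_{n|n}\Phi_n^T H_{n+1}^T K_{n+1}^T - K_{n+1}H_{n+1}\Phi_n \Sigma_{n|n}\Phi_n^T H_{n+1}^T K_{n+1}^T - K_{n+1}R_{n+1}K_{n+1}^T\}$$

$$= \mathrm{Trace}\{[\Sigma_{n+1|n}H_{n+1}^T - K_{n+1}(H_{n+1}\Sigma_{n+1|n}H_{n+1}^T + R_{n+1})]K_{n+1}^T\}.$$

Thus, choosing $K_{n+1} = \Sigma_{n+1|n}H_{n+1}^T[H_{n+1}\Sigma_{n+1|n}H_{n+1}^T + R_{n+1}]^{-1}$ as in (5.18) makes $\mathrm{Trace}\{E[\tilde{u}_{n+1}\hat{u}_{n+1|n+1}^T]\} = 0$ and therefore minimizes $E[\tilde{u}_{n+1}^T \tilde{u}_{n+1}]$.

The equation for $\Sigma_{n+1|n+1}$ in (5.17) can then be derived by simply evaluating $\Sigma_{n+1|n+1} = E[\tilde{u}_{n+1}\tilde{u}_{n+1}^T]$.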
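
For readers who want the omitted evaluation spelled out, one possible route is sketched below. It writes the updated error as $\tilde{u}_{n+1} = (I - K_{n+1}H_{n+1})(u_{n+1} - \hat{u}_{n+1|n}) - K_{n+1}v_{n+1}$ and assumes that $v_{n+1}$ is uncorrelated with the prediction error; the last step uses the gain in (5.18), which makes the bracketed term vanish.

```latex
\begin{aligned}
\Sigma_{n+1|n+1}
 &= E[\tilde{u}_{n+1}\tilde{u}_{n+1}^T] \\
 &= (I - K_{n+1}H_{n+1})\Sigma_{n+1|n}(I - K_{n+1}H_{n+1})^T
    + K_{n+1}R_{n+1}K_{n+1}^T \\
 &= (I - K_{n+1}H_{n+1})\Sigma_{n+1|n}
    - \left[\Sigma_{n+1|n}H_{n+1}^T
    - K_{n+1}(H_{n+1}\Sigma_{n+1|n}H_{n+1}^T + R_{n+1})\right]K_{n+1}^T \\
 &= (I - K_{n+1}H_{n+1})\Sigma_{n+1|n}.
\end{aligned}
```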


5.3.3  Nonlinear  estimation 


Probability  densities 

The filters we looked at in the previous section are optimal linear estimators in the sense that a minimum variance or least squares condition is satisfied. Estimators like the Kalman filter are only optimal, however, if the system is linear and the corresponding probability densities are Gaussian. Let us now consider how one might approach estimation problems when these rather stringent conditions are relaxed.

Let us begin by recalling the density function in (5.10):

$$P(u_n \mid y^k) = \frac{P(y^k \mid u_n)P(u_n)}{P(y^k)} \tag{5.22}$$

where $u_n = (x_n, p)$ is the joint vector of state and parameters and $y^k = (y_0, y_1, \ldots, y_k)$ represents a vector of observations. This density function represents everything we know about a state given the observations specified. Techniques that use this density directly to estimate the parameters of a system are known as Bayesian estimation algorithms. For example, one might simply attempt to pick an estimate, $\hat{u}_{n|k}$, so that $P(u_n \mid y^k)$ is maximized at $u_n = \hat{u}_{n|k}$. This is known as a maximum a posteriori (MAP) estimate.

If the system, $g$, is linear in (5.4) and all the a priori information and measurement noises are Gaussian, then the MAP estimator gives the same answers as the Kalman filter (e.g., see [23]). We can see this by considering how the appropriate conditional probability densities get transformed by the dynamics of a linear system and combined with new data, as in the prediction and combination steps of the Kalman filter.

Figure 5.1: Mapping probability densities using $g$ and combining them with new information. This is a probabilistic view of what a recursive estimator like the Kalman filter does. Note that Gaussian densities have equal probability density surfaces that form ellipsoids. In two dimensions we draw the densities as ellipses.

For example, suppose that the density, $P(u_n \mid y^n)$, is Gaussian for some value of $n$ (see figure 5.1). The density, $P(u_{n+1} \mid y^n)$, can then be determined from $P(u_n \mid y^n)$ by simply mapping $P(u_n \mid y^n)$ using the system dynamics, $g$. More precisely we have that:

$$P(u_{n+1} \mid y^n) = \sum_{z \in U(u_{n+1})} P(z \mid y^n)\,|Dg(z)|^{-1} \tag{5.23}$$

where $U(u_{n+1}) = \{z \mid z = g^{-1}(u_{n+1})\}$ and $|Dg(z)|$ is the determinant of the Jacobian of $g$ evaluated at $z$. It is not hard to show that if $g$ is linear and $P(u_n \mid y^n)$ is Gaussian then $P(u_{n+1} \mid y^n)$ is also Gaussian. Also by Bayes rule, ($P(A,B) = P(A \mid B)P(B) = P(B \mid A)P(A)$) we have that:

$$P(u_{n+1}, y_{n+1} \mid y^n) = P(u_{n+1} \mid y^{n+1})P(y_{n+1} \mid y^n) = P(y_{n+1} \mid u_{n+1}, y^n)P(u_{n+1} \mid y^n)$$

where $P(y_{n+1} \mid y^n) = \int P(y_{n+1} \mid u_{n+1})P(u_{n+1} \mid y^n)\,du_{n+1}$. Thus we find that combining information from a new measurement, $y_{n+1}$, results in the density:

$$P(u_{n+1} \mid y^{n+1}) = \frac{P(y_{n+1} \mid u_{n+1})P(u_{n+1} \mid y^n)}{P(y_{n+1} \mid y^n)} \tag{5.24}$$

Since the denominator is independent of $u_{n+1}$, it is simply a normalizing factor and is therefore not important for our considerations. Also note that since $P(y_{n+1} \mid u_{n+1})$ and $P(u_{n+1} \mid y^n)$ are Gaussian, $P(u_{n+1} \mid y^{n+1})$ must also be Gaussian. Thus, by induction if all the data is Gaussian distributed, then $P(u_k \mid y^k)$ must be Gaussian for any $k$. Also, the MAP estimate and minimum variance estimate for $u_{n+1}$ are both the same, namely

$$\hat{u}_{n+1|n+1} = E[u_{n+1} \mid y^{n+1}].$$
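
The two density operations in (5.23) and (5.24) can be checked on a grid for a scalar example. The sketch below is illustrative only: it assumes the invertible linear map $g(u) = a\,u$, a direct observation $y = u + v$ with noise variance `r`, and hypothetical function names. For this case the posterior mean should agree with the Kalman update.

```python
import numpy as np

def gaussian(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def propagate_and_update(grid, prior, a, y_new, r):
    """Map a gridded density through g(u) = a*u, then fold in y_new.

    grid  : uniformly spaced, increasing sample points
    prior : values of P(u_n | y^n) on the grid
    """
    # (5.23) with a single preimage z = u/a and |Dg| = |a|
    predicted = np.interp(grid / a, grid, prior, left=0.0, right=0.0) / abs(a)
    # (5.24): multiply by the likelihood P(y_new | u), then normalise
    posterior = gaussian(y_new, grid, r) * predicted
    dx = grid[1] - grid[0]
    return posterior / (posterior.sum() * dx)
```

For a nonlinear or noninvertible $g$ the same two steps still apply, but the sum in (5.23) then runs over several preimages and the result is no longer Gaussian.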

Now consider what happens if the system is nonlinear. The appropriate densities still describe all we know about the states and parameters. In particular, the equations in (5.23) and (5.24) are still valid descriptions of how to map ahead and combine densities. However, in general there are no constraints on the form of these densities. As a practical matter, the problem becomes how to deal with these arbitrary probability densities. How can one represent approximations of the densities in a computationally tractable form while still retaining enough information to generate useful estimates? There have been a number of efforts in this area:

Extended  Kalman  filter 

The most basic and widely used trick is to simply linearize the system around the best estimate of the trajectory and then use the Kalman filter. The idea is that if the covariances of the relevant probability densities are small enough, then the system acts approximately linearly on the densities, so linear filtering may adequately describe the situation. For the system,

$$u_{n+1} = g(u_n) \tag{5.25}$$

$$y_n = H_n u_n + v_n, \tag{5.26}$$

as in (5.3), (5.4), and (5.5), the extended Kalman filter is given by the following equations, mirroring the Kalman filter in (5.14)-(5.18):

Prediction: 

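$$\hat{u}_{n+1|n} = g(\hat{u}_{n|n}) \tag{5.27}$$

$$\Sigma_{n+1|n} = Dg(\hat{u}_{n|n})\,\Sigma_{n|n}\,Dg(\hat{u}_{n|n})^T \tag{5.28}$$

Combination:

$$\hat{u}_{n+1|n+1} = \hat{u}_{n+1|n} + K_{n+1}(y_{n+1} - H_{n+1}\hat{u}_{n+1|n}) \tag{5.29}$$

$$\Sigma_{n+1|n+1} = (I - K_{n+1}H_{n+1})\Sigma_{n+1|n} \tag{5.30}$$

where the Kalman gain, $K_{n+1}$, is given by:

$$K_{n+1} = \Sigma_{n+1|n}H_{n+1}^T[H_{n+1}\Sigma_{n+1|n}H_{n+1}^T + R_{n+1}]^{-1} \tag{5.31}$$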
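
One step of the extended Kalman filter can be sketched as below. The caller supplies the map `g` and its Jacobian `jac_g`; these names, and the logistic-map example in the usage note, are assumptions made for illustration rather than anything prescribed by the report.

```python
import numpy as np

def ekf_step(u_est, cov, y_new, g, jac_g, H, R):
    """One extended Kalman filter step in the form of (5.27)-(5.31)."""
    # Prediction: push the estimate through g, the covariance through Dg
    J = jac_g(u_est)
    u_pred = g(u_est)
    cov_pred = J @ cov @ J.T
    # Kalman gain (5.31)
    K = cov_pred @ H.T @ np.linalg.inv(H @ cov_pred @ H.T + R)
    # Combination (5.29)-(5.30)
    u_new = u_pred + K @ (y_new - H @ u_pred)
    cov_new = (np.eye(len(u_est)) - K @ H) @ cov_pred
    return u_new, cov_new

# Hypothetical example: joint vector u = (x, p) for x -> p*x*(1-x)
def logistic_joint(u):
    x, p = u
    return np.array([p * x * (1.0 - x), p])

def logistic_joint_jacobian(u):
    x, p = u
    return np.array([[p * (1.0 - 2.0 * x), x * (1.0 - x)],
                     [0.0, 1.0]])
```

With `H = np.array([[1.0, 0.0]])` only the state component is observed, and the parameter is corrected through the off-diagonal terms of the predicted covariance.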


Other  work  in  nonlinear  estimation 

A number of other efforts to do estimation on nonlinear systems have concentrated on developing a better description of the probability densities. For example, in [23] methods are presented that attempt to take into account second order behavior from the dynamics. However, the method still relies on a basically Gaussian assumption of the error distributions, since it computes and propagates only the mean and covariance matrices of densities, adjusting the computations to account for errors due to nonlinearity. Taking into account higher order effects in the densities is in fact a difficult proposition because there is no obvious representation for these densities. Gaussian densities are invariant under linear transformations, and are especially easy to deal with when it comes to combining data from new measurements. However, similar higher order representations do not exist.

Other methods do attempt to get a better representation of the error densities. For example in [2], a method is proposed whereby the densities are represented as a sum of Gaussians. For example, one might write:

$$P(u) = \sum_i \alpha_i\,\mathcal{N}(u; m_i, \Sigma_i)$$

where the $\alpha_i$'s represent scalar constants and $\mathcal{N}(u; m_i, \Sigma_i)$ evaluates the Gaussian density function with mean $m_i$ and covariance matrix $\Sigma_i$ at $u$.⁶ If each of the Gaussians in the sum is localized in state-parameter space (has small covariance) then we might be able to use linear filters to evolve and combine each density in the sum in order to generate a representation of the entire density.
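
Evaluating such a representation is straightforward; the sketch below shows one way to do it. The function names are hypothetical, and the weights are assumed to be supplied by the caller (for a proper density they would be nonnegative and sum to one).

```python
import numpy as np

def gaussian_density(u, mean, cov):
    """Multivariate Gaussian density with the given mean and covariance."""
    d = len(mean)
    diff = u - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def gaussian_sum_density(u, weights, means, covs):
    """Weighted sum of Gaussian components evaluated at u."""
    return sum(a * gaussian_density(u, m, S)
               for a, m, S in zip(weights, means, covs))
```

In a Gaussian sum filter each `(mean, cov)` pair would be advanced by its own linearized filter, which is only reasonable while every component stays well localized.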


5.4  Applying traditional techniques to chaotic systems

In this section we examine why traditional techniques have a difficult time performing high accuracy parameter estimation on chaotic systems. This investigation will illuminate some of the general difficulties one encounters when dealing with chaotic systems, and will provide some useful ground rules for designing new parameter estimation algorithms.

Let  us  attempt,  for  example,  to  naively  apply  an  estimator  like  the  extended  Kalman 
filter  in  (5.27)-(5.31)  to  a  chaotic  system  and  see  what  problems  emerge. 

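⁶In other words, $\mathcal{N}(u; m_i, \Sigma_i) = \frac{1}{(2\pi)^{l/2}|\Sigma_i|^{1/2}} \exp\{-\tfrac{1}{2}(u - m_i)^T \Sigma_i^{-1}(u - m_i)\}$ if $l$ is the dimension of $u$.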

The first problem one is likely to encounter is numerical in nature, and has a relatively well-known solution. It turns out that the formulation in (5.27)-(5.31) is not numerically sound. The problems are especially bad in chaotic systems because covariance matrices become ill-conditioned quickly as densities are stretched exponentially along unstable manifolds and contracted exponentially along stable manifolds. Similar sorts of problems, albeit less severe, have been encountered and dealt with by conventional filtering theory. One solution is to represent the covariance matrix $\Sigma_{n|n}$ as the product of two matrices:

$$\Sigma_{n|n} = S_{n|n} S_{n|n}^T \tag{5.32}$$

and propagate the matrices $S_{n|n}$ instead of $\Sigma_{n|n}$. These estimation techniques, known as square root algorithms, are mathematically the same as the Kalman filter, but have the advantage that they are less sensitive to ill-conditioned covariance matrices. Using square root algorithms, for instance, the resulting covariance matrices are assured to remain positive definite. Since the decomposition in (5.32) is not unique, there are a number of possible implementations for such algorithms. The reader is referred to Kaminski [31] and related papers for detailed implementation descriptions.⁷
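
To give a flavor of the idea, the sketch below propagates a square root factor through a linear prediction step using a QR factorization. It is a generic illustration of factor propagation, not the specific algorithm from [31]; the function name is hypothetical, and only the prediction step is shown.

```python
import numpy as np

def propagate_sqrt_factor(S, Phi):
    """Return a triangular S_pred with S_pred S_pred^T = Phi S S^T Phi^T.

    The covariance itself is never formed: if (Phi S)^T = Q R with Q
    orthogonal, then (Phi S)(Phi S)^T = R^T R, so R^T is a valid factor.
    """
    _, r_factor = np.linalg.qr((Phi @ S).T)
    return r_factor.T
```

Because the covariance is only ever reconstructed as a product of a factor with its own transpose, it cannot lose symmetry or pick up negative eigenvalues through rounding.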

Other problems result from the nonlinearity of the system. Some of these problems can be observed in general nonlinear systems, while others seem to be unique to chaotic systems. First of all, using a linearized parameter estimation technique on any nonlinear system can cause trouble, even if the system is not chaotic. Often errors due to nonlinearity cause the filter to become too confident in its estimates, which prevents the filter from updating its information correctly based on new data and eventually locks the filter into a parameter estimate with larger error than expected. This phenomenon is known as divergence.⁸ It is not hard to see why divergence can become a problem with estimators like the Kalman filter. For example, in the linear Kalman filter, note that the estimation error covariance matrix, $\Sigma_{n|n}$, can actually be precomputed without knowledge of the data. In other words there is no feedback between the actual performance of the filter and the filter's estimate of its own accuracy. In the extended Kalman filter there is also virtually no feedback between the observed residuals, $y_n - H_n \hat{u}_n$, and the computed covariance matrix, $\Sigma_{n|n}$.
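
The lack of feedback in the linear case is easy to see in code: the covariance recursion below never touches a measurement. This is a minimal sketch with hypothetical names, assuming noise-free linear state dynamics.

```python
import numpy as np

def covariance_schedule(cov0, Phis, Hs, Rs):
    """Error covariances of a linear Kalman filter, computed without data."""
    covs = []
    cov = cov0
    for Phi, H, R in zip(Phis, Hs, Rs):
        cov_pred = Phi @ cov @ Phi.T
        K = cov_pred @ H.T @ np.linalg.inv(H @ cov_pred @ H.T + R)
        cov = (np.eye(cov.shape[0]) - K @ H) @ cov_pred
        covs.append(cov)
    return covs
```

Since no `y` appears anywhere in the loop, large residuals cannot widen the filter's stated uncertainty, which is how an overconfident filter stays overconfident.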

The divergence problem is considerably worse in nonuniformly hyperbolic systems than it is in other nonlinear applications. This is because folding, a highly nonlinear phenomenon, is crucial to parameter estimation. While linearized strategies may do reasonably well following most chaotic trajectories if the uncertainty variances are small, linearized techniques invariably have great trouble with the sections of trajectories that are most sensitive to parameter perturbations. Figure 5.2 gives a schematic of what happens when folding occurs. The linearized probability densities in that case become poor approximations to the real densities. Note that the composite densities look extremely long and thin because the densities have gotten stretched and contracted along unstable and stable manifolds.

⁷In this report, whenever we refer to numerical results using square root filtering techniques, the implementation we use is the one given in [31] labeled "Square Root Covariance II."

⁸See for example, Ljung [41] for discussion of some related work.


Figure 5.2: In this picture we show a typical example of what can happen to probability densities in chaotic systems. Because of the effects of local folding, linear filters like the Kalman filter sometimes have difficulty tracking nonuniformly hyperbolic dynamical systems.

In  chapter  6,  we  show  some  examples  of  the  performance  of  the  square  root  extended 
Kalman  filter  on  various  maps.  The  filter  generally  performs  reasonably  well  at  first 
but  eventually  diverges  as  the  trajectory  it  is  tracking  passes  close  to  a  folding  area.  As 
we  observed  earlier,  once  the  extended  Kalman  filter  becomes  too  confident  about  its 
estimate,  it  generally  cannot  recover.  While  various  ad  hoc  techniques  can  make  small 
improvements  to  this  problem,  none  of  the  standard  techniques  I  encountered  did  an 
adequate  job  of  handling  the  folding.  For  example,  consider  the  case  of  the  Gaussian 
sum  filter,  which  is  basically  the  only  method  that  one  might  expect  to  have  a  chance  at 
modeling  the  folding  behavior.  Note  that  the  densities  in  the  Gaussian  sum  have  to  be 
re-decomposed  into  constituent  Gaussians  every  few  iterations  because  of  spreading,  as 
expansion  along  unstable  manifolds  quickly  pushes  most  of  the  constituent  densities  out 
into  regions  of  near  zero  probability.  In  addition,  the  position  of  the  apex  of  the  fold, 
which  is  crucial  to  estimating  the  correct  parameters,  is  quite  difficult  to  get  a  handle 
on  without  including  many  terms  in  the  representation  of  the  density. 


5.5  An  algorithm  in  one  dimension 

In  the  previous  section  we  saw  that  traditional  techniques  do  not  seem  to  do  a  reasonable 
job  modeling  the  effects  of  folding  on  parameter  estimation.  Since  there  seems  to  be 
no  simple  way  of  adequately  representing  a  probability  density  as  it  gets  folded,  we 
resort to a Monte Carlo representation of densities near folded regions, meaning that
the  appropriate  densities  are  sampled  at  many  different  points  in  state  and  parameter 
space  and  this  data  is  used  as  a  representation  for  the  density  itself.  The  eventual  hope  is 
that  we  will  only  have  to  examine  a  fraction  of  the  data  using  computationally-intensive 
techniques  like  Monte  Carlo,  since  we  know  that  only  a  few  sections  of  data  are  really 
sensitive  to  parameter  values. 

Though the ideas are simple, the actual implementation of such parameter estimation
techniques is not as easy as one might think because of numerical problems associated with
chaotic  systems.  In  this  section  we  examine  the  basics  of  how  to  apply  Monte  Carlo-type 
analysis  to  chaotic  systems  by  looking  at  an  algorithm  for  one-dimensional  noninvertible 
systems.  An  algorithm  for  higher  dimensional  invertible  systems  will  be  considered  in 
section  5.6. 


5.5.1  Motivation 

Let us consider the following question. Suppose we are given a family of maps of the interval, $f_p : I_x \to I_x$, for $p \in I_p$ and noisy measurement data, $\{y_n\}$, such that:

$$x_{n+1} = f_{p_0}(x_n)$$

$$\text{and} \quad y_n = x_n + v_n,$$

where $x_n \in I_x$ for all $n$, $I_x \subset \mathbb{R}$, and $p_0 \in I_p \subset \mathbb{R}$ such that $f_{p_0}$ is chaotic. Suppose also that the $v_n$'s are zero mean Gaussian independent variables with covariance matrix, $R_n$, and that we have some a priori knowledge about the value of $p_0$. Given this information, we would like to use the state samples, $\{y_n\}$, to get a better estimate of $p_0$. Let us assume for the moment that we have plenty of computing power and time. What sort of method is likely to extract the most possible information about the parameters of the system given the state data?

The first thing one might try is to simply start picking parameter values, $p$, near $p_0$ and initial conditions, $x$, near $y_0$, and attempt to iterate orbits of the form $\{f_p^i(x)\}_{i=0}^{n}$ to see if they come close to $\{y_i\}_{i=0}^{n}$. If no orbit of $f_p$ follows $\{y_i\}_{i=0}^{n}$ then we know that $p_0 \neq p$. As we increase $n$, many orbits of the form $\{f_p^i(x)\}_{i=0}^{n}$ diverge from $\{y_i\}_{i=0}^{n}$, and we can gradually discard more and more values of $p$ as candidates for the actual parameter value, $p_0$.
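
A brute-force version of this idea is sketched below. Everything specific in it is an assumption made for illustration: the logistic family stands in for $f_p$, closeness is judged by a fixed tolerance `tol`, and the function names are hypothetical.

```python
import numpy as np

def logistic(x, p):
    """Hypothetical example family f_p(x) = p*x*(1-x)."""
    return p * x * (1.0 - x)

def surviving_parameters(ys, p_candidates, x_candidates, tol, f=logistic):
    """Keep p if some sampled orbit of f_p stays within tol of every y_i."""
    survivors = []
    for p in p_candidates:
        x = np.array(x_candidates, dtype=float)
        alive = np.ones(x.shape, dtype=bool)
        for y in ys:
            # An orbit is dropped the first time it strays from the data
            alive &= np.abs(x - y) <= tol
            x = f(x, p)
        if alive.any():
            survivors.append(p)
    return survivors
```

A fixed set of initial conditions thins out quickly as the data record grows, because nearby orbits separate exponentially; this is one reason the samples need to be replenished as more data is processed.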


5.5.2  Overview 

In order to implement this idea, we first need some criteria for measuring how closely orbits of $f_p$ follow $\{y_i\}$ and some rules for how to use this information to decide whether the parameter value, $p$, should remain a candidate for our estimate of $p_0$. Basically, we want $p$ to be eliminated if the best shadowing orbit, $\{f_p^i(x)\}$, of $f_p$ is far enough away from $\{y_i\}$ that it is highly unlikely that sampling $\{f_p^i(x)\}$ could have resulted in $\{y_i\}$, given the expected measurement noise. As discussed earlier, one way to do this is to think of $x_n$, $y_n$, and $p_0$ as random variables and to consider a probability density function of the form, $P(x_0, p_0 \mid y^n)$. Our goal will be to numerically sample such probability densities and use the results to extract information about the parameters. This is accomplished in stages, since we can only reliably compute orbits for a limited number of iterates at once. Information from various stages can then be combined to construct the composite density, $P(x_0, p_0 \mid y^n)$, for increasing values of $n$.

So, for example, let us examine how to analyze the $k$th stage of observations, consisting of the data, $\{y_i\}_{i=N_k}^{N_{k+1}}$, where $N_{k+1}$ is chosen to be as far away from $N_k$ as possible without greatly affecting the numerical computation of orbits shadowing $\{y_i\}_{i=N_k}^{N_{k+1}}$. Let $y[a, b] = (y_a, y_{a+1}, \ldots, y_b)$ be a vector of state data. We begin by picking values of $p$ near $p_0$. For each of these parameter samples, $p$, we pick a number of initial conditions, $x$, and iterate out orbits of the form $\{f_p^i(x)\}_{i=N_k}^{n}$ for $n > N_k$ to evaluate $P(x_{N_k} \mid p_0, y[N_k, n])$ for increasing values of $n$.⁹

For each $n > N_k$ we want to keep track of the set of initial conditions $x_{N_k} \in I_x$ such that $P(x_{N_k} \mid p_0, y[N_k, n])$ is above a threshold value. If $P(x_{N_k} \mid p_0, y[N_k, n])$ is below the threshold for some value of $x_{N_k}$, we discard the orbit $\{f_p^i(x_{N_k})\}_{i=0}^{n}$ because it is too far from $\{y_i\}_{i=N_k}^{n}$, and attempt to repopulate a region, $U_k(p, n) \subset I_x$, in state space with more initial conditions, where $U_k(p, n)$ is constrained so that $x \in U_k(p, n)$ implies that $P(x_{N_k} \mid p_0, y[N_k, n])$ is above the threshold. Some care must be taken in figuring out how to choose $U_k(p, n)$ so that new initial conditions can be generated effectively. Without care, these regions develop Cantor-set-like structure that is difficult to deal with.

After collecting information from various stages, we then recursively combine the information from consecutive stages (similar to probabilistically combining densities in the Kalman filter) in order to determine the appropriate overall statistics for concatenated orbits over multiple stages. After combining information, at the end of each stage we also take a look at the composite densities for the various parameter samples, $p$. Values of $p$ whose densities are too low are thrown out, since this means that $f_p$ has no orbits which closely shadow $\{y_i\}$. The surviving parameter set, i.e., the set in parameter space still being considered for the parameter estimate, must then be repopulated with new parameter samples. The statistics of the new parameter samples may be determined through a combination of interpolation with nearby parameter samples and recomputation of the statistics of nearby stages. Because of the asymmetrical behavior in shadowing discussed in chapters 3 and 4, we find that $P(x_0, p_0 \mid y^{N_{k+1}})$ generally has an extremely asymmetrical structure with respect to $p$. Specifically, the density $P(x_0, p_0 \mid y^{N_{k+1}})$ generally drops off extremely rapidly for parameters at either the higher or lower end of the surviving parameter range (see numerical results in section 6.1). This allows us to get an extremely accurate parameter estimate for $p_0$ by simply choosing our estimate, $\hat{p}_{N_{k+1}}$, to be the extremum of the surviving parameter range where the density drops off rapidly.

⁹Note that $P(x_{N_k} \mid p_0, y[N_k, n])$ is sufficient to determine $P(x_{N_k}, p_0 \mid y^n)$ for any particular value of $p$, since

$$P(x_{N_k}, p_0 \mid y^n) = P(x_{N_k} \mid p_0, y^n)\,P(p_0)$$

where $P(p_0)$ is a normalizing factor quantifying a priori information about the parameters.

A block diagram summarizing the main steps in the algorithm is shown in figure 5.3.


Figure 5.3: This block diagram illustrates the main steps in the proposed estimation algorithm for one-dimensional systems. The algorithm breaks up the data in sections called "stages." The diagram above shows the basic steps the algorithm takes in analyzing each stage of data.


5.5.3  Implementation


Below we explain various aspects of the algorithm in more depth. Note that unless otherwise indicated, $x_n$, $y_n$, and $p_0$ refer to random variables in the discussion below.

Evaluating  probability  densities 

The first thing we must address is how to compute the values of relevant densities. From (5.24) we have that:

$$P(x_0, p_0 \mid y^n) = \frac{P(y_n \mid x_0, p_0)\,P(x_0, p_0 \mid y^{n-1})}{P(y_n \mid y^{n-1})} \tag{5.33}$$

Expanding the right hand side of this equation recursively we have:

$$P(x_0, p_0 \mid y^n) = K_1 P(x_0, p_0) \prod_{i=0}^{n} \mathcal{N}(y_i; f_{p_0}^i(x_0), R_i) \tag{5.34}$$

where $K_1$ is some constant and $P(x_0, p_0)$ is the probability density representing a priori knowledge about the values of $x_0$ and $p_0$, while $\mathcal{N}(y_i; f_{p_0}^i(x_0), R_i)$ is the value of a Gaussian density with mean $f_{p_0}^i(x_0)$ and covariance matrix $R_i$ evaluated at $y_i$. In the limit where no a priori knowledge about $x_0$ is available, the weighting factor, $P(x_0, p_0)$, reduces to $P(p_0)$, reflecting a priori information about the parameters. Then, taking the natural log of (5.34) we get that:

$$\log[P(x_0, p_0 \mid y^n)] = K_2 + \log[P(p_0)] - \frac{1}{2}\sum_{i=0}^{n} (f_{p_0}^i(x_0) - y_i)^T R_i^{-1} (f_{p_0}^i(x_0) - y_i) \tag{5.35}$$

where $K_2$ is a constant. Note that except for the extra term corresponding to the a priori distribution for $p_0$, maximizing (5.35) is essentially the same as minimizing a least squares criterion. Also note that for any particular value of $p_0$ we have from (5.35) that:

$$\log[P(x_0 \mid p_0, y^n)] = \log[P(x_0, p_0 \mid y^n)] - \log[P(p_0)]$$

$$= K_2 - \frac{1}{2}\sum_{i=0}^{n} (f_{p_0}^i(x_0) - y_i)^T R_i^{-1} (f_{p_0}^i(x_0) - y_i). \tag{5.36}$$
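
For a scalar state the sum in (5.35) is a weighted sum of squared residuals along the orbit, and can be accumulated as in the sketch below. The names are hypothetical, the constant $K_2$ is dropped, and the noise variance is assumed to be the same value `r_var` at every step.

```python
def logistic_map(x, p):
    """Hypothetical example map f_p(x) = p*x*(1-x)."""
    return p * x * (1.0 - x)

def log_density(x0, p, ys, r_var, f=logistic_map, log_prior=0.0):
    """Right-hand side of (5.35) without K_2, for scalar states.

    log_prior stands in for log P(p); the default of 0 corresponds to
    the conditional form (5.36).
    """
    total = log_prior
    x = x0
    for y in ys:
        total -= 0.5 * (x - y) ** 2 / r_var
        x = f(x, p)
    return total
```

Working with log densities rather than densities avoids underflow, since the product in (5.34) shrinks rapidly as more observations are included.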


Representing  and  dividing  state  regions 

Given a parameter sample, $p_0$, and stage, $k$, we need to specify how to choose sample trajectories, $\{f_{p_0}^i(x_{N_k})\}_{i=N_k}^{n}$, to shadow $\{y_i\}_{i=N_k}^{n}$ for $n \in \{N_k, N_k + 1, \ldots, N_{k+1}\}$. For each $n \in \{N_k, N_k + 1, \ldots, N_{k+1}\}$ we want to keep track of the set of interesting initial conditions, $U_k(p_0, n) \subset I_x$, from which to choose states, $x_{N_k}$, to evaluate the density, $P(x_{N_k} \mid p_0, y[N_k, n])$. We require that if $x_{N_k} \in U_k(p_0, n)$, then $x_{N_k}$ must satisfy the following thresholding condition:

$$\log[P(x_{N_k} \mid p_0, y[N_k, n])] \geq \sup_{x_{N_k} \in I_x} \{\log[P(x_{N_k} \mid p_0, y[N_k, n])]\} - \sigma^2 \tag{5.37}$$

for some constant, $\sigma > 0$, so that the orbit, $\{f_{p_0}^i(x_{N_k})\}_{i=N_k}^{n}$, follows sufficiently close to $\{y_i\}_{i=N_k}^{n}$. $\sigma$ can be interpreted to be a measure of the maximum number of standard deviations $x_{N_k}$ is allowed to be from the best shadowing orbit of the map, $f_{p_0}$. This interpretation arises since if $P(x_{N_k} \mid p_0, y^n)$ were Gaussian, condition (5.37) would be satisfied by all states, $x_{N_k}$, within $\sigma$ standard deviations of the mean, $\bar{x}_{N_k}(p_0, n) = \int_{x_{N_k} \in I_x} x_{N_k} P(x_{N_k} \mid p_0, y[N_k, n])\,dx_{N_k}$.¹⁰ To be reasonably sure we don't accidentally eliminate important shadowing orbits of $f_{p_0}$ close to $\{y_i\}$, we might choose, for example, for $\sigma$ to be between 8 and 12.
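
Applied to a finite set of samples, the thresholding test might look like the sketch below. The names are hypothetical, the margin is taken to be `sigma**2`, and the supremum over $I_x$ is approximated by the best sample seen so far, which is an assumption of this illustration.

```python
import numpy as np

def threshold_states(x_samples, log_dens, sigma):
    """Keep samples whose log-density is within sigma**2 of the best one."""
    log_dens = np.asarray(log_dens, dtype=float)
    keep = log_dens >= log_dens.max() - sigma ** 2
    return np.asarray(x_samples)[keep]
```

Because the comparison is relative to the best sample, the cutoff tracks the data automatically and no absolute density scale has to be chosen.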

Given a parameter sample, $p_0$, let $V_k(p_0, n) \subset I_x$ represent the set of all $x_{N_k} \in I_x$ satisfying (5.37). Recall that $U_k(p_0, n)$ represents the set of points from which we will choose new sample initial conditions, $x_{N_k}$. We know that we want $U_k(p_0, n) \subset V_k(p_0, n)$, but problems arise if we always attempt to saturate the set $V_k(p_0, n)$ with sample trajectories. For low values of $n$, $V_k(p_0, n)$ is an interval. In this case, let $U_k(p_0, n) = V_k(p_0, n)$ and we can simply choose initial conditions, $x_{N_k}$, at random inside $V_k(p_0, n)$ to generate samples of $P(x_{N_k} \mid p_0, y[N_k, n])$. As $n$ gets larger, $V_k(p_0, n)$ tends to shrink as iteration of $f_{p_0}$ expands regions in state space and more trajectory samples get discarded from consideration for failing to satisfy (5.37). However, as long as $V_k(p_0, n)$ is an interval, continue to set $U_k(p_0, n) = V_k(p_0, n)$, since it is not hard to keep track of $V_k(p_0, n)$ to repopulate the region with new trajectory samples.

A problem occurs, however, because of the folding around turning points. If the
region, $f^m_{p_0}(V_k(p_0, m))$, contains a turning point for some integer $m > 0$, then as $n$ grows
larger than $m$, $V_k(p_0, n)$ may split into two distinct intervals, $V_k^+(p_0, n)$ and $V_k^-(p_0, n)$.
Folding causes the two separate regions to get mapped into each other by $f^{m+1}_{p_0}$ (i.e.,
$f^{m+1}_{p_0}(V_k^+(p_0, n)) = f^{m+1}_{p_0}(V_k^-(p_0, n))$). In addition, the new intervals, $V_k^+(p_0, n)$ and
$V_k^-(p_0, n)$, can also be split apart into other separate intervals by similar means as $n$
increases. In principle, this sort of phenomenon can happen arbitrarily many times,
turning $V_k(p_0, n)$ into a collection of thin, disjoint intervals. This makes it difficult
to keep up with a characterization of $V_k(p_0, n)$, and makes it difficult to know how to
choose new initial conditions, $x_{N_k} \in V_k(p_0, n)$, to replace trajectory samples that have
been eliminated.

Instead of attempting to keep up with all the separate areas of $V_k(p_0, n)$, and trying
to repopulate all these areas with new state samples, we let $U_k(p_0, n) \subset V_k(p_0, n)$ be
the single connected interval of $V_k(p_0, n)$ where $P(x_{N_k} \mid p_0, y[N_k, n])$ is a maximum.^11 We
know that the separate areas of $V_k(p_0, n)$ eventually get mapped into each other, so
there is no way that one of the separate areas of $V_k(p_0, n)$ can end up shadowing $\{y_i\}$
if no states in $U_k(p_0, n)$ can shadow $\{y_i\}$. Since we are primarily interested in the best
shadowing orbit of $f_{p_0}$, keeping up with orbits with initial conditions in $U_k(p_0, n)$ is
adequate.

^10 One might think that this Gaussian assumption may be a bad one and that in general we might, for
instance, want to make sure that we kept a set, $Q$, of initial states such that $\Pr(x_{N_k} \in Q \mid p_0) > 1 - \alpha$
for $\alpha > 0$ small, where $\Pr(X)$ is the probability of event $X$. However, in practice, the condition (5.37)
is simpler to evaluate and works well for all the problems encountered. The choice of thresholding value
is not critically important as long as it is not so high that close shadowing orbits are thrown away from
consideration.

^11 Strictly speaking we actually want to maximize $P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k]) \, P(x_{N_k} \mid p_0, y[N_k, n])$ (see
the section on how to combine data). In practice this almost always amounts to maximizing

Finally, note also that it is sometimes obvious that the parameter sample, $p_0$, cannot
possibly be the correct parameter value. This happens if no orbit of $f_{p_0}$ comes anywhere
close to shadowing $\{y_i\}$. In this case we can immediately discard parameter sample, $p_0$,
from consideration.
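
Choosing $U_k(p_0, n)$ as the single connected interval containing the density maximum can be sketched as below. This is an illustrative assumption about how connectivity might be detected from finitely many surviving samples: two neighbouring survivors are treated as belonging to different intervals when the gap between them exceeds a hypothetical tolerance `max_gap`, which does not appear in the text. The list of survivors is assumed to be non-empty.

```python
def best_connected_interval(points, scores, max_gap):
    """Return (lo, hi) of the cluster of surviving points that holds the top score.

    points:  surviving sample states (any order, non-empty)
    scores:  their log-densities, in the same order
    max_gap: hypothetical spacing above which neighbours count as disconnected
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    clusters, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > max_gap:
            clusters.append(current)   # gap found: close the current interval
            current = []
        current.append(nxt)
    clusters.append(current)
    best = max(clusters, key=lambda c: max(scores[i] for i in c))
    return points[best[0]], points[best[-1]]
```

New initial conditions would then be drawn only from the returned interval, leaving the other fragments of $V_k(p_0, n)$ unsampled.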

Deciding  what  parameters  to  keep 

We  need  to  evaluate  how  good  a  parameter  sample  is,  so  we  know  which  parameter 
samples  to  keep  and  which  parameters  to  eliminate  as  a  possible  choice  for  the  parameter 
estimate.  After  the  completion  of  stage  k,  we  evaluate  a  parameter  sample,  po,  according 
to  the  following  criterion: 

$$L_{k+1}(p_0) = \sup_{x_{N_k} \in I_x} \{\log[P(x_{N_k}, p_0 \mid y^{N_{k+1}})]\} \qquad (5.38)$$

which is what one would expect if we were interested in obtaining a MAP estimate. Let
$\mathcal{P}_k$ be the set of parameter samples valid at the start of the $k$th stage. We will eliminate
a parameter sample, $p_0$, after the $k$th stage if it satisfies the following formula:

$$L_{k+1}(p_0) < \sup_{p' \in \mathcal{P}_k} \{L_{k+1}(p')\} - \sigma^2$$

where $\sigma > 0$ is some measure of the number of standard deviations $p_0$ is allowed to be
from the most likely parameter value.
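
The elimination rule for parameter samples can be sketched as a one-line filter. The dictionary layout and the function name are assumptions made for illustration, and the supremum over the valid parameter samples is taken over the finite set of scores supplied.

```python
def prune_parameters(log_scores, sigma):
    """Keep parameter samples p with L(p) >= max L - sigma**2.

    log_scores: dict mapping each parameter sample to its score L_{k+1}(p).
    """
    best = max(log_scores.values())
    return {p: L for p, L in log_scores.items() if L >= best - sigma ** 2}

survivors = prune_parameters({3.89: -120.0, 3.90: -10.0, 3.91: -70.0}, sigma=8.0)
print(sorted(survivors))
```

With $\sigma = 8$ the cutoff sits 64 below the best score, so the sample at 3.89 is discarded while the other two survive.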

Choosing  the  number  of  iterates  per  stage 

The  necessity  of  breaking  up  orbits  into  stages  is  apparent,  since  orbits  can  be  reliably 
computed  only  for  a  limited  number  of  iterates.  We  now  explain  how  to  determine  the 
number of iterates in each stage. Let $p_{MAP}(k)$ be the MAP estimate for $p_0$ at the
beginning of stage $k$ (i.e., $p = p_{MAP}(k)$ is the parameter sample that maximizes $L_k(p)$ for
any $p \in \mathcal{P}_k$). We want to choose $N_{k+1}$ to be as large as possible provided we are still
able to reliably compute orbits of the form $\{f^i_{p_0}(x_{N_k})\}_{i=0}^{N_{k+1}-N_k}$ to shadow $\{y_i\}_{i=N_k}^{N_{k+1}}$.

Suppose that $x_{N_k} \in U_k(p_0, n)$. A reasonable measure of the number of iterates we
can reliably compute for an orbit like $\{f^i_{p_0}(x_{N_k})\}_{i=0}^{n-N_k}$ is given by the size of $U_k(p_0, n)$. If
$U_k(p_0, n)$ is small, this implies that small changes or errors in initial state get magnified
to magnitudes on the order of the measurement noise. Since we need to compute states
to accuracies better than the measurement noise, it makes sense to pick $N_{k+1}$ so that
$U_k(p_0, N_{k+1})$ is a few orders of magnitude above the precision of the computer.

$P(x_{N_k} \mid p_0, y[N_k, n])$ because $U_k(p_0, n)$ is generally much smaller than $f^{N_k - N_{k-1}}_{p_0}(U_{k-1}(p_0, N_k))$.

One complication that can arise is that the sequence of states, $y_{N_k+1}, \ldots$,
might correspond to an especially parameter-sensitive stretch of points, so that there
may be no orbit of $f_{p_{MAP}(k)}$ that shadows the data, $\{y_i\}_{i=N_k}^{n}$. In this case, we cannot use
the size of $U_k(p_{MAP}(k), n)$ to determine $N_{k+1}$. Instead of using $p_{MAP}(k)$, pick the next
best parameter sample in $\mathcal{P}_k$, $p'(k)$, where $p'(k)$ maximizes $L_k(p)$ for any $p \in \mathcal{P}_k$ besides
$p_{MAP}(k)$. We then try the same procedure with $p'$ that we described for $p_{MAP}(k)$.
Similarly, if $f_{p'}$ cannot shadow the data, choose another parameter value from $\mathcal{P}_k$, and
so forth. Eventually some parameter value in $\mathcal{P}_k$ must work, or else either: (1) there are
not enough parameter samples, or (2) $p_0$ is not in the parameter space region specified
upon entrance to the $k$th stage. This can especially be a problem at the beginning of
the estimation process when the parameters are not known well, and parameter samples
are more sparse in parameter space. The solution is to choose parameters intelligently,
choosing varying numbers of parameter samples in different regions of parameter space
and in different situations (for example, to initialize the estimation routine).

Combining  data  from  stages 

As in the Kalman filter, we want to build a recursive algorithm so that data summarizing
information for stages 1 through $k - 1$ can be combined with information
from stage $k$ to produce results which summarize all knowledge about stages 1 through
$k$. Specifically, suppose that $y[N_k, N_{k+1}] = (y_{N_k}, y_{N_k+1}, \ldots, y_{N_{k+1}})$ represents the state
samples of the $k$th stage. We propose to compute $L_{k+1}(p_0)$ using information given in
$L_k(p_0)$, $P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k])$, and $P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])$. Then all information about
stages 1 through $k$ can be represented by $L_{k+1}(p_0)$ and $P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])$.

From (5.38) we see that $L_k(p_0)$ depends only on $P(x_{N_{k-1}}, p_0 \mid y^{N_k})$ evaluated on the
orbit that best shadows the first $N_k$ state samples. In other words if $\{x_{i|N_k}\}_{i=0}^{N_k}$ is the
best shadowing orbit based on the first $N_k$ state samples, then from (5.38) and (5.35):

$$L_k(p_0) = \log[P(x_{N_{k-1}} = x_{N_{k-1}|N_k}, p_0 \mid y^{N_k})]$$
$$= K_2 + \log[P(p_0)] - \frac{1}{2} \sum_{i=0}^{N_k} (x_{i|N_k} - y_i)^T R_i^{-1} (x_{i|N_k} - y_i). \qquad (5.39)$$

One key thing to notice is that $U_{k-1}(p_0, N_k)$ and $U_k(p_0, N_{k+1})$ should be very small
compared to the measurement noise, $R_i$, for any $i$. This is a reasonable assumption as
long as none of the measurements have relative accuracies on the order of the machine
precision. Therefore we can approximate $x_{i|N_{k+1}}$ with $x_{i|N_k}$ for $i \in \{0, 1, \ldots, N_{k-1}\}$ in
(5.39) and if we let:

$$A_k(p_0) = \log[P(p_0)] - \frac{1}{2} \sum_{i=0}^{N_{k-1}} (x_{i|N_k} - y_i)^T R_i^{-1} (x_{i|N_k} - y_i) \qquad (5.40)$$

Then from (5.36), (5.39), and (5.40):

$$L_k(p_0) \approx A_k(p_0) + \sup_{x_{N_{k-1}}} \{\log[P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k])]\} \qquad (5.41)$$

and also:

$$L_{k+1}(p_0) \approx A_k(p_0) - \frac{1}{2} \sum_{i=N_{k-1}}^{N_k} (x_{i|N_{k+1}} - y_i)^T R_i^{-1} (x_{i|N_{k+1}} - y_i)$$
$$+ \sup_{x_{N_k}} \{\log[P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])]\}. \qquad (5.42)$$

We can now evaluate (5.42) given the appropriate representations of $L_k(p)$, $P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k])$,
and $P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])$. The term on the right hand side of (5.42) involving $\sup_{x_{N_k}}$
can be approximated from our representation of the density $P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])$ by
simply taking the maximum density value over all the trajectory samples. Likewise
$A_k(p_0)$ can be evaluated from (5.41) in a similar manner given $L_k(p_0)$. The trajectory
$\{x_{i|N_{k+1}}\}_{i=N_{k-1}}^{N_k}$ can be approximated by looking for the trajectory sample $x' \in U_{k-1}(p_0, N_k)$
in the representation for $P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k])$ that makes $f^{N_k - N_{k-1}}_{p_0}(x')$ as close to
$U_k(p_0, N_{k+1})$ as possible. Then let $x_{i|N_{k+1}} = f^{i - N_{k-1}}_{p_0}(x')$ for $i \in \{N_{k-1}, \ldots, N_k\}$.

Note that this assumes that $U_k(p_0, N_{k+1}) \subset f^{N_k - N_{k-1}}_{p_0}(U_{k-1}(p_0, N_k))$. If this is not true
then no orbit of $f_{p_0}$ adequately shadows the data, and we can throw out the parameter
sample $p_0$.
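
The recursive update in (5.42) can be sketched for scalar observations as follows. The function name, argument layout, and the bookkeeping that carries the running term forward (taking the next running term to be the current one minus the new quadratic sum) are assumptions made for illustration; the text only states (5.40) to (5.42).

```python
def update_score(a_k, residuals, inv_variances, stage_log_densities):
    """Approximate L_{k+1}(p0) as in (5.42), for scalar observations.

    a_k:                 running term A_k(p0) from earlier stages, as in (5.40)
    residuals:           x_{i|N_{k+1}} - y_i for i = N_{k-1}, ..., N_k
    inv_variances:       1 / R_i for the same indices
    stage_log_densities: log P(x_{N_k} | p0, y[N_k, N_{k+1}]) evaluated at
                         each trajectory sample of the current stage
    Returns (a_next, l_next); a_next is the assumed running term for the
    following stage.
    """
    quad = sum(r * r * w for r, w in zip(residuals, inv_variances))
    a_next = a_k - 0.5 * quad
    # The supremum is approximated by the best trajectory sample.
    return a_next, a_next + max(stage_log_densities)
```

Only the running term and the current stage's sampled density need to be stored between stages, which is what makes the scheme recursive in the same sense as a Kalman filter.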

Choosing  new  parameter  samples  and  evaluating  associated  densities 

Once a parameter sample is deleted because it does not satisfy (5.37), a new parameter
sample must be chosen along with the appropriate statistics and densities. We want
to choose new parameters after stage $k$ so that they adequately describe $L_{k+1}(p)$ over the
surviving parameter range. In other words we attempt to choose new parameters to fill
in gaps in parameter space where nearby parameter samples, $p_1$ and $p_2$, for example,
have very different values of $L_{k+1}(p_1)$ and $L_{k+1}(p_2)$.

Once we choose the new parameter sample, $p^*$, we need to evaluate the relevant
statistics, namely $L_{k+1}(p^*)$ and $P(x_{N_k} \mid p_0 = p^*, y[N_k, N_{k+1}])$. We could, of course, do
this by going back through all of the data, $\{y_i\}_{i=0}^{N_{k+1}}$, and sampling the appropriate densities.
This, however, would be quite time-consuming, and would likely not reveal much more
information about the parameters than we could get by much simpler means, assuming
that enough parameter samples are used. Instead, we interpolate $A_k(p^*)$ given $A_k(p)$
for all valid parameter samples, $p \in \mathcal{P}_k$. We then compute $P(x_{N_{k-1}} \mid p_0, y[N_{k-1}, N_k])$ and
$P(x_{N_k} \mid p_0, y[N_k, N_{k+1}])$ by iterating trajectory samples. We can then evaluate $L_{k+1}(p^*)$
according to (5.42).
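
Both steps above, choosing where to place a new parameter sample and interpolating its running score, can be sketched as below. The text does not say which interpolation rule is used or exactly how gaps are ranked; linear interpolation and "midpoint of the adjacent pair with the largest score difference" are assumptions chosen for simplicity, and the function names are hypothetical. Parameter samples are assumed to be sorted in increasing order.

```python
from bisect import bisect_left

def propose_parameter(params, scores):
    """Midpoint of the adjacent pair of samples whose scores differ the most."""
    j = max(range(len(params) - 1), key=lambda i: abs(scores[i + 1] - scores[i]))
    return 0.5 * (params[j] + params[j + 1])

def interpolate_score(p_star, params, scores):
    """Linearly interpolate a score at p_star from scores at sorted samples."""
    if not params[0] <= p_star <= params[-1]:
        raise ValueError("p_star lies outside the surviving parameter range")
    j = bisect_left(params, p_star)
    if params[j] == p_star:
        return scores[j]
    t = (p_star - params[j - 1]) / (params[j] - params[j - 1])
    return (1.0 - t) * scores[j - 1] + t * scores[j]
```

The interpolated value stands in for $A_k(p^*)$ only; the stage densities for $p^*$ would still be obtained by iterating trajectory samples, as described above.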

Efficiency  concerns 

This  algorithm  is  not  designed  to  be  especially  efficient.  Rather,  it  is  intended  to 
try  to  extract  as  much  information  about  the  parameters  of  a  one-dimensional  map  as 
reasonably  possible.  For  a  discussion  of  some  performance  issues,  see  the  next  section 
where we apply the algorithm to the family of quadratic maps.

One way to increase the efficiency of this algorithm would be to attempt to locate the
sections  of  the  data  orbit  that  are  sensitive  to  parameters,  and  perform  the  appropriate 
analysis  only  on  these  observations.  For  maps  of  the  interval  this  corresponds  to  locating 
sections  of  orbit  that  pass  near  turning  points.  The  problem,  however,  is  not  as  obvious 
in  higher  dimensions.  Rather  than  address  this  issue  in  a  one-dimensional  setting,  in 
section  5.6  we  will  look  at  how  this  might  be  done  in  higher  dimensional  systems  using 
linear  analyses. 


5.6  Algorithms  for  higher  dimensional  systems 


In this section we develop an algorithm to estimate the parameters of general nonuniformly
hyperbolic systems. Suppose we are given a family of maps, $f_p : M \to M$, for
$p \in I_p$ and noisy measurement data, $\{y_n\}$, where:

$$x_{n+1} = f_{p_0}(x_n)$$
$$\text{and} \quad y_n = x_n + v_n$$

where $x_n \in M$ for all $n$, $M$ is some metric space, and $p_0 \in I_p \subset \mathbb{R}$ such that $f_{p_0}$ is
nonuniformly hyperbolic. Suppose also that the $v_n$'s are zero mean Gaussian independent
random variables with covariance matrix, $R_n$, and that we have some a priori knowledge
about the value of $p_0$. Our goal in this section is to develop an algorithm to estimate $p_0$
given $\{y_n\}$.

Like  the  algorithm  for  one-dimensional  systems  discussed  in  the  last  section,  the 
estimation  technique  presented  here  is  based  on  an  analysis  of  probability  densities  using 
a Monte-Carlo-like approach. The idea, however, is to avoid the heavy computational
burden typical of Monte Carlo methods by selectively choosing which pieces of data
to fully analyze. Since most of the state data in a nonuniformly hyperbolic system
apparently does not contribute much information about the parameters of the system, the
objective  is  to  quickly  bypass  the  vast  majority  of  data,  but  still  construct  extremely 
accurate  parameter  estimates  by  performing  intensive  analyses  on  the  small  sections  of 
data  that  really  matter. 


5.6.1  Overview 

The  parameter  estimation  algorithm  has  two  primary  components.  The  first  component 
sifts  through  the  data  to  locate  orbit  sections  that  might  be  sensitive.  The  second 
component  performs  an  analysis  on  the  parameter-sensitive  data  sections  to  determine 
the  parameter  estimate. 

The data is first scanned using a linear estimator like the square root extended
Kalman filter. As described in chapter 4, linear analyses can indicate the presence of
degeneracy in the hyperbolic structure of a system. In the case of a recursive linear filter,
degeneracies corresponding to parameter-sensitive stretches of data are indicated by a
sharp drop in the covariance matrix of the estimate. We simply run the data through
the appropriate filter, look for a drop in the covariance estimate over a small number of
iterates, and note the appropriate sections of data for further analysis.

The second component of the estimation technique consists of a Monte-Carlo-based
technique. The underlying basis for this analysis is similar to what was described in
section 5.5 for one-dimensional systems. Basically the estimate is constructed by using
information obtained by sampling the appropriate probability densities in state and
parameter space. There are, however, a few important differences to point out from the
one-dimensional algorithm. First, since the systems are invertible, we iterate the map
both forwards and backwards in time^12 in order to obtain information about probability
densities. Also, the higher dimensionality of the systems causes a few problems with how
to represent and choose regions of state space in which to generate samples. Finally,
instead of concatenating consecutive stages by matching initial and final conditions of
sample trajectories, we generate only one stage for each section of sensitive state data.
The stages are separated in space and time, so there is no matching of initial and final
conditions.

5.6.2  Implementation 

In  this  section  we  detail  some  of  the  basic  issues  that  need  to  be  addressed  in  order  to 
implement  the  proposed  algorithm. 

Top-level  scan  filter 

The data is first scanned by a square root extended Kalman filter. The implementation
is straightforward: simply process the data and look for drops in the error covariance
matrix. There are two parameters that may be adjusted: (1) a parameter, $N$, to set the
number of iterates (time scale) over which to look for degeneracies, and (2) a parameter, $\alpha$, to set the
threshold that governs whether a section of data is sent to the Monte-Carlo algorithm
for further analysis. $\alpha$ is expressed in terms of a ratio of the square roots of the variances
of the parameter error.
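
The scan can be sketched as a simple pass over the filter's output. The sketch assumes the filter has already produced, at each iterate, the square root of the parameter-error variance; it does not implement the square root extended Kalman filter itself. The direction of the ratio (later value divided by earlier value, flagged when below the threshold) is an assumption, as are the names.

```python
def flag_sensitive_sections(param_std, window, alpha):
    """Return indices i where the parameter-error standard deviation drops
    by more than the factor alpha over `window` iterates.

    param_std: square roots of the parameter-error variance at each iterate
               (assumed to come from the scan filter)
    window:    number of iterates N over which a drop is measured
    alpha:     threshold in (0, 1); index i is flagged when
               param_std[i + window] / param_std[i] < alpha
    """
    return [i for i in range(len(param_std) - window)
            if param_std[i + window] / param_std[i] < alpha]

flags = flag_sensitive_sections([1.0, 0.99, 0.98, 0.5, 0.49, 0.48], window=2, alpha=0.6)
print(flags)
```

Runs of adjacent flagged indices would be merged into a single section of data and handed to the Monte-Carlo stage.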

Evaluating  densities 

Let $y^n = (y_0, y_1, \ldots, y_n)$. To estimate parameters, we are interested in densities of
the form $P(x_0, p_0 \mid y^n)$. From (5.36) we have that:

$$\log[P(x_0, p_0 \mid y^n)] = \log P(p_0) + \log[P(x_0 \mid p_0, y^n)]$$
$$= K_2 + \log P(p_0) - \frac{1}{2} \sum_{i=0}^{n} (f^i_{p_0}(x_0) - y_i)^T R_i^{-1} (f^i_{p_0}(x_0) - y_i) \qquad (5.43)$$

where $K_2$ is a constant.

^12 For lack of a better term we use "time" to refer to increasing iterations of the discrete map $f_p$. For
example applying $f_p$ to a state will sometimes be called mapping forwards in time and applying $f_p^{-1}$
will be referred to as mapping backwards in time.

Information about probability densities is obtained by sampling in state and parameter
space. For a MAP estimator, we expect that the relative merit of various parameter
samples, $p_0$, would be evaluated according to the formula:

$$L(p_0 \mid y^n) = \sup_{x_0 \in I_x} \log[P(x_0, p_0 \mid y^n)]$$
$$= \log P(p_0) + \sup_{x_0 \in I_x} \log[P(x_0 \mid p_0, y^n)]$$
$$= K_2 + \log P(p_0) - \frac{1}{2} \inf_{x_0 \in I_x} \Big\{ \sum_{i=0}^{n} (f^i_{p_0}(x_0) - y_i)^T R_i^{-1} (f^i_{p_0}(x_0) - y_i) \Big\}.$$

In general, however, we will only consider a few sets of observations in the sequence,
$\{y_i\}$. For example, suppose that for any integer, $n > 0$, the linear filter has identified
$k(n)$ groups or stages of measurements that may be sensitive to parameters. Then for
each $j \in \{1, 2, \ldots, k(n)\}$, define $Y_j = \{y_i \mid i \in S_j\}$ to be a set of sensitive measurements
that have been singled out by the linear filter, where the sets, $S_j \subset \mathbb{Z}$, represent the indices
that can be used to identify the measurements. From our arguments in chapters 3
and 4 we expect that most of the information about the parameters of the system can
be extracted locally by looking at each group of measurements individually. Thus we
consider the statistic, $L_{k(n)}(p_0)$, as a replacement for $L(p_0 \mid y^n)$ where:

$$L_{k(n)}(p_0) = K_2 + \log P(p_0) + \sum_{j=1}^{k(n)} \sup_{x_0} \log[P(x_0 \mid p_0, Y_j)]$$
$$= K_4(k(n)) + \log P(p_0) - \frac{1}{2} \sum_{j=1}^{k(n)} \inf_{x_0} \Big\{ \sum_{i \in S_j} (f^i_{p_0}(x_0) - y_i)^T R_i^{-1} (f^i_{p_0}(x_0) - y_i) \Big\}$$

and $K_4(k(n))$ depends only on $k(n)$.

As in the one-dimensional case, we eliminate parameter samples, $p$, that fail to
satisfy a thresholding condition: $L_{k(n)}(p) > \sup_{p' \in \mathcal{P}_{k(n)}} \{L_{k(n)}(p')\} - \sigma^2$ for some $\sigma > 0$,
where $\mathcal{P}_{k(n)}$ is the set of parameter samples at stage $k(n)$. In practice, if $Y_j$ for $j \in
\{1, 2, \ldots, k(n)\}$ are really the main measurements sampling parameter-sensitive areas of
local folding, then $L_{k(n)}(p_0)$ in fact mirrors $L(p_0 \mid y^n)$, at least with respect to eliminating
parameter values that are not favored. This is the most important property of $L_{k(n)}(p_0)$
with respect to parameter estimation, since, as in the one-dimensional case, we would
like to choose the parameter estimate, $\hat{p}$, to reflect the extremum of the surviving
parameter range where $L(p_0 \mid y^n)$ drops off rapidly.

Stages 

Suppose that the linear filter decides that the data, $\{y_i\}$, might be sensitive near iterate
$i = N_k$. Given parameter sample, $p_0$, we begin to examine the density, $P(x_{N_k} \mid p_0, y[N_k - n, N_k + n])$,
for increasing values of $n$ by generating trajectory samples of the form
$\{f^i_{p_0}(x_{N_k})\}_{i=-n}^{n}$ and evaluating:

$$\log[P(x_{N_k} \mid p_0, y[N_k - n, N_k + n])] = K - \frac{1}{2} \sum_{i=-n}^{n} (f^i_{p_0}(x_{N_k}) - y_{N_k+i})^T R_{N_k+i}^{-1} (f^i_{p_0}(x_{N_k}) - y_{N_k+i})$$

for some constant, $K$. As in the one-dimensional case, for each $n$ we keep only trajectory
samples, $x_{N_k}$, that satisfy a thresholding condition like:

$$\log[P(x_{N_k} \mid p_0, y[N_k - n, N_k + n])] > \sup_{x_{N_k}} \{\log[P(x_{N_k} \mid p_0, y[N_k - n, N_k + n])]\} - \sigma^2 \qquad (5.44)$$

for some $\sigma > 0$. As $n$ is increased, we replace trajectory samples that have been thrown
out for failing to satisfy (5.44) by trying new initial conditions chosen at random from
a bounded region in state space which we will denote $B_0(p_0, N_k, n)$. $B_0(p_0, N_k, n) \subset M$
plays a role analogous to $U_k(p_0, N_{k+1})$ in the one-dimensional case, except that it is a
multidimensional neighborhood instead of simply an interval.
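
The two-sided density evaluation can be sketched for an invertible two-dimensional map. The sketch uses the Hénon map in its standard form with parameters `a` and `b` as a stand-in example; the noise is assumed isotropic with a known standard deviation, the additive constant $K$ is dropped, and all names are hypothetical.

```python
def henon(a, b, state):
    """Henon map: (x, y) -> (1 - a x^2 + y, b x)."""
    x, y = state
    return (1.0 - a * x * x + y, b * x)

def henon_inverse(a, b, state):
    """Inverse Henon map, valid for b != 0."""
    x, y = state
    x_prev = y / b
    return (x_prev, x - 1.0 + a * x_prev * x_prev)

def two_sided_log_density(a, b, state, observations, center, n, noise_std):
    """Unnormalised log P(x_{N_k} | p0, y[N_k - n, N_k + n]).

    observations: list of (x, y) measurements; center: the index N_k.
    Assumes R_i = noise_std**2 * I for every i.
    """
    def penalty(s, obs):
        return 0.5 * sum((u - v) ** 2 for u, v in zip(s, obs)) / noise_std ** 2

    total = -penalty(state, observations[center])
    fwd = bwd = state
    for i in range(1, n + 1):
        fwd = henon(a, b, fwd)            # map forwards in time
        bwd = henon_inverse(a, b, bwd)    # map backwards in time
        total -= penalty(fwd, observations[center + i])
        total -= penalty(bwd, observations[center - i])
    return total
```

Trajectory samples would be scored with this function for increasing `n` and filtered by the thresholding condition (5.44), exactly as in the one-dimensional case.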

Representing  sample  regions 

Given a specific parameter sample, $p_0$, we now discuss how to choose trajectory
samples. In particular we examine the proper choice of $B_0(p_0, N_k, n)$ for $n > 0$. For
any $n > 0$, the objective is to choose $B_0(p_0, N_k, n)$ so that it is a reasonably efficient
representation of the volume of space occupied by $X_0(p_0, N_k, n)$, where $X_0(p_0, N_k, n) \subset M$
is a bounded region in state space such that $x \in X_0(p_0, N_k, n)$ satisfies (5.44). We want to
choose a simple representation for $B_0(p_0, N_k, n)$ so that $B_0(p_0, N_k, n)$ is large enough that
$B_0(p_0, N_k, n) \supset X_0(p_0, N_k, n)$, but small enough so that if an initial condition $x$ is chosen
at random from $B_0(p_0, N_k, n)$ then there is high probability that $x \in X_0(p_0, N_k, n)$. We
get an idea for what $X_0(p_0, N_k, n)$ is by iterating old trajectory samples of the density,
$P(x_{N_k} \mid p_0, y[N_k - (n - 1), N_k + (n - 1)])$, and deleting the initial conditions that do not
satisfy (5.44). Based on these trajectory samples, we choose $B_0(p_0, N_k, n)$ to be a simple
parallelepiped enclosing the surviving initial conditions. As new trajectory samples are
chosen by picking random initial conditions in $B_0(p_0, N_k, n)$, we get a better idea about
the geometry of $X_0(p_0, N_k, n)$ and can in turn choose a more efficient $B_0(p_0, N_k, n)$ to
generate additional trajectory samples.

In our implementation of the algorithm, $B_0(p_0, N_k, n)$ is always represented as a
box. This method has the advantage that it is extremely simple and also makes it
very easy to generate a random initial condition within the region, $B_0(p_0, N_k, n)$. One
could also use more sophisticated approximations for $B_0(p_0, N_k, n)$. However, no matter
what representation we use for $B_0(p_0, N_k, n)$, we are likely to have trouble after a while
choosing new initial conditions and iterating new sample trajectories to satisfy (5.44).
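
Representing the sample region as a box and drawing from it can be sketched as below. The box is taken to be axis-aligned, which is an assumption (the text only says "box"); the optional padding and the function names are likewise hypothetical.

```python
import random

def bounding_box(points, pad=0.0):
    """Axis-aligned box (lows, highs) enclosing the surviving initial conditions."""
    dims = range(len(points[0]))
    lows = [min(p[d] for p in points) - pad for d in dims]
    highs = [max(p[d] for p in points) + pad for d in dims]
    return lows, highs

def sample_in_box(box, rng):
    """Draw one initial condition uniformly at random from the box."""
    lows, highs = box
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))

survivors = [(0.1, 0.5), (0.3, 0.2), (0.2, 0.4)]
box = bounding_box(survivors)
candidate = sample_in_box(box, random.Random(0))
```

Each candidate is then iterated and tested against (5.44); the fraction that passes indicates how well the box matches the surviving region.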

Dividing  sample  regions 

There are two main reasons why the default choice of $B_0(p_0, N_k, n)$ as described
above can cause problems. First, just as in the one-dimensional case, high probability
density areas in state space can split apart into separate regions. For example, in
figure 5.4 we see that regions A and B converge towards each other both forwards and
backwards in time (i.e., under the action of both $f_p$ and $f_p^{-1}$). Both regions include
orbits that shadow $\{y_i\}_{i=N_k-n}^{N_k+n}$ for large values of $n$. Note that this sort of phenomenon
is particularly likely to happen near areas of folding, which are the areas we are most
interested in investigating. This situation is a problem because if we attempt to choose
$B_0(p_0, N_k, n)$ to be a large region enclosing both A and B, then there is low probability
that an initial condition chosen at random from $B_0(p_0, N_k, n)$ will satisfy (5.44). The
solution to this problem, however, is not too difficult. As in the one-dimensional case we
simply choose $X_0(p_0, N_k, n)$ to be whichever region, A or B, has the highest density
values and concentrate on sampling that region.


Figure 5.4: Here we illustrate why there can be multiple regions shadowing the same orbit.
Near areas of folding, two regions, A and B, can be separate, yet can get asymptotically
mapped toward each other both forwards and backwards in time. Note that in the picture, A
and B are located at intersections of the same stable and unstable manifolds. This situation
must be dealt with when sampling probability densities and searching for optimal shadowing
orbits.

Avoiding degenerate sample regions

The other problem is that $X_0(p_0, N_k, n)$ tends to collapse onto a lower dimensional
surface as $n$ gets large. This is due to the fact that the map, $f_{p_0}$, generally contracts and
expands some directions in state space more than others. Our ability to compute orbits
like $\{f^i_{p_0}(x)\}_{i=-n}^{n}$ is related to the largest expansion factor of either $f^n_{p_0}$ or $f^{-n}_{p_0}$ (i.e., the
square root of the largest eigenvalue of $Df(x)^T Df(x)$ for the corresponding map $f$). If $X_0(p_0, N_k, n)$ collapses onto a lower dimensional
surface, that means that across the width of the surface of $X_0(p_0, N_k, n)$, tiny differences
in initial conditions get magnified to the level of the measurement noise by either $f^n_{p_0}$
or $f^{-n}_{p_0}$. For example, if $f^n_{p_0}$ is responsible for collapsing $X_0(p_0, N_k, n)$ onto a surface
with thickness comparable to the machine precision, then we cannot expect to choose
trajectory samples of the form $f^i_{p_0}(x)$ for $i > n$ without experiencing debilitating roundoff
errors.

Ideally,  as  n  increases,  we  would  like  Xo{po,Nk,n)  to  converge  toward  smaller  and 
smaller  ball-shaped  regions  while  maintaining  approximately  the  same  thickness  in  every 
direction.  Besides  having  better  numerical  behavior  than  regions  that  collapse  onto  a 
lower-dimensional  surface,  it  is  also  much  easier  to  represent  such  regions  and  choose 
initial  conditions  inside  these  regions. 

There is a degree of freedom that is available and can be used to adjust the shape
of the region where initial conditions are sampled. We can simply choose to iterate
trajectory samples further backwards in time than forwards in time or vice-versa. In
other words, if $f^n_{p_0}$ expands one direction much more than $f^{-n}_{p_0}$ expands any direction in
state space then we may iterate orbits of the form $\{f^i_{p_0}(x)\}_{i=-n_a}^{n_b}$ where $n_a > n_b$. The
relative sizes of $n_a$ and $n_b$ can then be adjusted to match the rates of convergence of the
region where initial conditions are sampled.

In practice it can be a bit tedious to adjust the number of iterates in sample trajectories
and attempt to figure out what effect iterating forwards or backwards has on the
shape of a particular region in state space. A better way to approach the problem is to
examine regions of the form:

$$X_j(p_0, N_k, n) = f^{-j}_{p_0}(X_0(p_0, N_k, n))$$

for $j \in \{-n, -n+1, \ldots, n-1, n\}$. For any particular $p_0$, $N_k$, and $n$, if $X_0(p_0, N_k, n)$ starts
to become an inadequate region for choosing new sample trajectories, we simply search
for a $j$ so that the region, $X_j(p_0, N_k, n)$, is not degenerate in any direction in state space
(this process is described in the next section). We can then pick new initial conditions,
$x \in X_j(p_0, N_k, n)$, and iterate orbits of the form $\{f^i_{p_0}(x)\}_{i=-n+j}^{n+j}$ in order to evaluate the
proper densities. Note that instead of deleting sample trajectories according to (5.44),
new sample trajectories are now thrown out if they fail to satisfy

$$\log[P(x_{N_k-j} \mid p_0, y[N_k - n, N_k + n])] > \sup_{x_{N_k-j}} \{\log[P(x_{N_k-j} \mid p_0, y[N_k - n, N_k + n])]\} - \sigma^2.$$

This procedure is thus equivalent to sampling trajectories from $X_0(p_0, N_k, n)$, except
that it is better numerically.




Evaluating  and  choosing  new  sample  regions 

We now describe how to decide when an initial condition sample region like $X_{j_0}(p_0, N_k, n)$
has become inadequate and how to choose a new $j^* \in \{-n, -n+1, \ldots, n-1, n\}$ so that
$X_{j^*}(p_0, N_k, n)$ makes an effective sample region.

Basically, as long as we can pick $B_0(p_0, N_k, n)$ so that most initial conditions, $x$,
chosen from $B_0(p_0, N_k, n)$ satisfy $x \in X_{j_0}(p_0, N_k, n)$, then things are satisfactory, and
there is no need to search for a new sample region. However, suppose that it becomes
difficult to choose $x \in B_0(p_0, N_k, n)$ so that $x \in X_{j_0}(p_0, N_k, n)$. It might be the case
that $X_{j_0}(p_0, N_k, n)$ is collapsing in multiple directions, and we simply cannot increase $n$
without running into numerical problems. If this is not the case, then we first check
whether $X_{j_0}(p_0, N_k, n)$ can be divided into two separate high density regions. If so,
then we concentrate on one of these regions. Otherwise we have to search for a new
$j^* \in \{-n, -n+1, \ldots, n-1, n\}$ and a new sample region, $X_{j^*}(p_0, N_k, n)$.

This is done in the following manner. We take the trajectory samples marking
the region, $X_{j_0}(p_0, N_k, n)$, and iterate them forwards and backwards in time looking at
samples of

$$X_j(p_0, N_k, n) = f^{j_0 - j}_{p_0}(X_{j_0}(p_0, N_k, n))$$

for $j \in \{-n + j_0, -n + j_0 + 1, \ldots, n + j_0\}$. We would like to pick $j^*$ to be a value for
$j$ such that $X_j(p_0, N_k, n)$ is not degenerate, so that it is easy to pick $B_0(p_0, N_k, n)$ such
that $x \in B_0(p_0, N_k, n)$ implies $x \in X_j(p_0, N_k, n)$ with high probability.

We would also like to pick $j^*$ so that $X_{j^*}(p_0, N_k, n)$ is a well balanced region and
is not degenerate in any direction. The first thing to check is to simply generate the
box, $B_j(p_0, N_k, n)$, enclosing $X_j(p_0, N_k, n)$ for each $j$ and make sure that none of its side
lengths are degenerate. This condition is not adequate, however, since one could end
up with a $j^*$ in which $X_{j^*}(p_0, N_k, n)$ is actually long and thin but curls back on itself
so that its bounding box, $B_j(p_0, N_k, n)$, is not long and thin. In order to check for this
case, one thing to do is to partition the box, $B_j(p_0, N_k, n)$, into a number of subregions
and check to see how many of these subregions are actually occupied by the trajectory
samples demarking $X_j(p_0, N_k, n)$. If very few subregions are occupied then we have to
reject $j$ as a possible choice for $j^*$. An adequate choice for $j^*$ can then be made using this
constraint along with information about the ratio of the side lengths of $B_j(p_0, N_k, n)$.
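
The two checks described above, the ratio of the bounding-box side lengths and the fraction of occupied subregions, can be sketched as follows. The uniform grid with `cells_per_side` cells along each axis is an assumed way of partitioning the box, and any acceptance thresholds applied to the two returned numbers would have to be chosen by the user; neither is specified in the text.

```python
def box_diagnostics(points, cells_per_side=4):
    """Return (side_ratio, occupied_fraction) for the bounding box of `points`.

    side_ratio:        longest box side divided by the shortest
    occupied_fraction: share of grid cells that contain at least one point
    """
    dims = range(len(points[0]))
    lows = [min(p[d] for p in points) for d in dims]
    sides = [max(p[d] for p in points) - lows[d] for d in dims]
    if min(sides) == 0.0:
        return float("inf"), 0.0      # box is degenerate in some direction
    occupied = set()
    for p in points:
        cell = tuple(
            min(int(cells_per_side * (p[d] - lows[d]) / sides[d]), cells_per_side - 1)
            for d in dims
        )
        occupied.add(cell)
    return max(sides) / min(sides), len(occupied) / cells_per_side ** len(sides)
```

A region that fills its box gives an occupied fraction near one, while a thin curve that curls through the box leaves most cells empty even though the side ratio looks acceptable.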


Chapter  6 

Numerical  results 


In  this  chapter  we  present  results  from  various  numerical  experiments.  In  particular,  we 
demonstrate the effectiveness of the algorithms proposed in chapter 5 for estimating the
parameters  of  chaotic  systems. 

The  algorithms  are  applied  to  four  different  systems.  The  first  system,  the  quadratic 
map,  is  the  same  one-dimensional  system  that  was  examined  in  chapter  3  of  this  report. 
The  second  system  we  look  at  is  the  Henon  map,  a  dissipative  two-dimensional  mapping 
with  a  strange  attractor.  The  third  system  is  the  standard  map,  an  area-preserving  map 
that exhibits chaotic behavior. Finally, in contrast to the first three systems, which are
all  nonuniformly  hyperbolic,  we  also  take  a  brief  look  at  the  Lozi  map,  one  of  the  few 
nonpathological  examples  of  a  chaotic  map  exhibiting  uniformly  hyperbolic  behavior. 

We find that, with the exception of the Lozi map, the maps in this chapter all exhibit
asymmetrical shadowing behavior in the parameter space of the map. Furthermore, this
asymmetrical behavior always seems to favor one direction in parameter space regardless
of location in state space.

Note  that  many  of  the  basic  comments  and  explanations  applicable  to  all  the  systems 
are  included  in  section  6.1  on  the  quadratic  map,  where  the  issues  are  first  encountered. 


6.1  Quadratic  map 


In  this  section  we  describe  numerical  experiments  on  the  quadratic  map: 


$$f_p(x) = p\,x(1 - x) \tag{6.1}$$

where $x \in [0, 1]$ and $p \in [0, 4]$. For values of $p$ between 3.57 and 4.00, numerical
experiments suggest that there are a large number of parameter values where (6.1) exhibits
chaotic behavior. In particular, we will concentrate on parameters near $p_0 = 3.9$. For
$p_0 = 3.9$, numerical results indicate that $f_{p_0}$ has a Lyapunov exponent of about 0.49.
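
As a rough check on the quoted exponent, one can average $\log |f_p'(x)|$ along a long orbit. The sketch below is an illustration only: the function names, the transient length, and the orbit length are assumptions made for this example, and the estimate depends mildly on those choices.

```python
import math


def quadratic_map(x, p):
    """One iterate of f_p(x) = p x (1 - x)."""
    return p * x * (1.0 - x)


def lyapunov_estimate(p, x0=0.4, transient=1000, n=100000):
    """Average of log|f_p'(x_i)| along an orbit, after discarding a transient."""
    x = x0
    for _ in range(transient):
        x = quadratic_map(x, p)
    total = 0.0
    for _ in range(n):
        # f_p'(x) = p (1 - 2x); the floor avoids log(0) if x lands exactly on 0.5.
        total += math.log(max(abs(p * (1.0 - 2.0 * x)), 1e-300))
        x = quadratic_map(x, p)
    return total / n


lyap = lyapunov_estimate(3.9)
```

For $p = 3.9$ the estimate should come out positive and of the same order as the value quoted above, which is consistent with chaotic behavior at this parameter.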

Let us begin by presenting a summary of our results for one particular orbit of the
quadratic map, the orbit with initial condition $x_0 = 0.4$. These results are summarized
in figure 6.1. Our discussion in this section will seek to answer the following questions:
(1) what each of the lines in figure 6.1 means, (2) why each of the data sets graphed has
the behavior shown, and (3) what we expect the asymptotic behavior of each of the
traces to be if the simulations were continued for higher numbers of data points.


Figure 6.1: This graph summarizes results related to estimating the parameter $p$ in the
quadratic map for data generated using the initial condition $x_0 = 0.4$.


6.1.1  Setting up the experiment

In order to test parameter estimation algorithms numerically, we first pick a parameter
value, $p_0$, and generate a sequence of data points to represent noisy measurements
of $f_{p_0}$. This is done by choosing an initial condition, $x_0$, and numerically iterating the
orbit $\{x_i = f_{p_0}^i(x_0)\}_{i=0}^n$. The noisy measurements, $\{y_i\}_{i=0}^n$, are then simulated by setting
$y_i = x_i + v_i$, where the $v_i$'s are randomly generated values for $i \in \{0, 1, \ldots, n\}$. For
the experiments in this section, the $v_i$'s are chosen to simulate independent identically
distributed Gaussian random variables with standard deviation 0.001.
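
The data generation step just described can be sketched as follows. This is a minimal illustration; the function name, the fixed seed, and the choice of 1000 iterates are assumptions made for the example rather than details of the original experiments.

```python
import random


def generate_noisy_orbit(p0, x0, n, noise_std=0.001, seed=0):
    """Return the clean orbit x_0..x_n and noisy measurements y_i = x_i + v_i."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(p0 * x * (1.0 - x))  # quadratic map iterate
    # Independent Gaussian noise with the stated standard deviation.
    ys = [x + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


xs, ys = generate_noisy_orbit(p0=3.9, x0=0.4, n=1000)
```

Only the list `ys` would be handed to an estimator; the clean orbit `xs` and the true parameter are kept aside to measure the estimation error afterwards.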

We then use the simulated data, $\{y_i\}_{i=0}^n$, as input to the parameter estimation
algorithm to see whether the algorithm can figure out what parameter value was used to
generate the data in the first place. In general the parameter estimation algorithm may
also use a priori information like an initial parameter estimate along with some measure
of how good that estimate is. In this chapter we generally choose the initial parameter
estimate to be a random value within 0.025 of $p_0$.


6.1.2  Kalman  filter 

Let us now examine what happens when we apply the square root extended Kalman
filter to the quadratic map. We investigate the Kalman filter for data generated from
four different initial conditions: $x_0 \in \{0.1, 0.2, 0.3, 0.4\}$.

Figure 6.2 illustrates perhaps the most important feature of the simulations, namely
that the Kalman filter eventually "diverges." Each trace in figure 6.2 represents the
average of ten different runs using ten different sets of numerically generated data from
each initial condition. On the $y$-axis we plot the ratio of the actual error of the
parameter estimate versus the estimated mean square error obtained from the covariance
matrix of the filter. If the filter is working, we generally expect this ratio to be close
to 1. Note also that the filter seems to start fine, but then the error jumps to many
"standard deviations" of the expected error and never returns to the normal operating
range.
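
To make the plotted ratio concrete, the sketch below runs a plain extended Kalman filter on the augmented state $(x, p)$ of the quadratic map and reports the actual parameter error divided by the filter's own estimated standard deviation. Several assumptions should be stressed: this is a conventional covariance-form filter and not the square root form used in the experiments, the ratio is computed against the square root of the parameter variance as one reading of the description above, and the function name, seed, run length, and initial guess are all hypothetical.

```python
import math
import random


def run_ekf(p_true=3.9, x0=0.4, n=300, noise_std=0.001,
            p_guess=3.91, p_guess_std=0.025, seed=1):
    """Plain EKF on state (x, p); returns (p estimate, |error| / estimated std)."""
    rng = random.Random(seed)
    r = noise_std ** 2  # measurement noise variance

    # Simulate noisy observations y_i = x_i + v_i of the true orbit.
    x_true = x0
    ys = []
    for _ in range(n + 1):
        ys.append(x_true + rng.gauss(0.0, noise_std))
        x_true = p_true * x_true * (1.0 - x_true)

    # Filter estimate and the three distinct entries of its 2x2 covariance.
    x, p = ys[0], p_guess
    pxx, pxp, ppp = r, 0.0, p_guess_std ** 2

    for y in ys[1:]:
        # Prediction: linearize f(x, p) = p x (1 - x) about the current estimate.
        a = p * (1.0 - 2.0 * x)  # df/dx
        b = x * (1.0 - x)        # df/dp
        x = p * x * (1.0 - x)
        pxx, pxp = (a * a * pxx + 2.0 * a * b * pxp + b * b * ppp,
                    a * pxp + b * ppp)
        # Measurement update: only the state x is observed.
        s = pxx + r
        kx, kp = pxx / s, pxp / s
        resid = y - x
        x += kx * resid
        p += kp * resid
        pxx, pxp, ppp = (1.0 - kx) * pxx, (1.0 - kx) * pxp, ppp - kp * pxp

    est_std = math.sqrt(max(ppp, 1e-300))
    return p, abs(p - p_true) / est_std


p_est, error_ratio = run_ekf()
```

A ratio near 1 means the filter's covariance is an honest description of its error; a ratio that climbs to many standard deviations and stays there is the divergence discussed above. Whether and when that happens in this sketch depends on the run length and noise realization.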

In fairness, plotting an average can be somewhat misleading because the average
might be skewed by outliers and runs that fail massively. There are in fact significant
differences from run to run. However, numerous experiments with the Kalman filter
suggest that divergence almost always occurs if one allows the filter to run long
enough. In addition, none of the standard techniques for addressing divergence
difficulties (e.g., exponential forgetting of data) seem to be able to adequately solve the problem.
It seems that one is stuck with either letting the filter diverge, or somehow decreasing
confidence in the covariance matrix so much that accurate estimates cannot be attained.

Figure 6.2: This figure shows results for applying the square root extended Kalman filter to
estimating the parameters of the quadratic map with $p = 3.9$. Each trace represents the average
ratio of the actual parameter estimate error to the estimated mean square error as calculated
by the Kalman filter over 10 different trials. The different traces represent experiments based
on orbits with different initial conditions. Note how the error jumps up to levels on the order
of 10 or higher, indicating divergence.

In figure 6.3 we plot the actual error of the Kalman filter versus number of state
samples used on a log-log scale. Again the errors plotted are the average of the errors of
ten different runs. We see that the error makes progress for a little while but then
divergence occurs. The Kalman filter rarely makes any real progress after divergence occurs,
not even exhibiting the improvement characteristic of purely stochastic convergence
(i.e., the filter is not getting any information from the dynamics), since the over-confident
covariance matrix prohibits the parameter estimate from moving much unless the state
data drifts many deviations away from what the filter expects.¹


6.1.3  Analysis  of  proposed  algorithm 

We now examine the performance of the algorithm presented in section 5.5. The results
in this section reflect an implementation of the algorithm based on 9 samples in parameter
space and 50 samples in state space (250 when representations for different stages
are being combined). Each stage is iterated until the state sample region is of length
1 × 10^(-[exponent illegible]) or less. We use $\sigma = 8$ so that the sample spaces in state and
parameters are 8 deviations wide.

One of the most striking things about the results of the algorithm is the asymmetry
of the merit function, $L(p)$, in parameter space. As shown in figure 6.4, the parameter
merit function typically shows a very sharp dropoff on the low end of the parameter
space. Based on this asymmetry we choose the parameter estimate to be the parameter
value at which the sharp dropoff in $L(p)$ occurs.
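
The way the estimate is read off the merit function can be sketched as follows. The rule that a sample is deleted once its normalized merit falls below $-\sigma^2$ (that is, $-64$ for $\sigma = 8$) is the one stated in the caption of figure 6.4; the function names and the merit values in the example are invented purely for illustration.

```python
def prune_parameter_samples(params, merit, sigma=8.0):
    """Normalize merit so its maximum is 0, then drop samples below -sigma^2."""
    best = max(merit)
    threshold = -(sigma ** 2)
    return [(p, m - best) for p, m in zip(params, merit) if m - best >= threshold]


def lower_edge_estimate(survivors):
    """Take the lowest surviving parameter value as the estimate."""
    return min(p for p, _ in survivors)


# Nine hypothetical parameter samples with a sharp dropoff on the low side.
params = [3.8990 + 0.0002 * k for k in range(9)]
merit = [-500.0, -300.0, -120.0, -70.0, -3.0, 0.0, -1.0, -4.0, -9.0]

survivors = prune_parameter_samples(params, merit)
estimate = lower_edge_estimate(survivors)
```

In this made-up example the four lowest samples are deleted and the estimate sits at the fifth sample, right at the edge of the dropoff. Taking the lower edge reflects the favored direction observed for the quadratic map; a system that favored the opposite direction would call for the upper edge instead.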

In figure 6.5 we see the performance of the algorithm on data based on the initial
conditions, $x_0 \in \{0.1, 0.2, 0.3, 0.4\}$. Each trace in the figure represents one run of the
algorithm. Rerunning the algorithm multiple times on data based on the same initial
condition produces similar results, except that the scanning linear filter sometimes defers
a few more or less points to the Monte Carlo estimator for analysis.

Note  how  the  error  in  the  estimate  tends  to  converge  in  sudden  large  jumps  over  small 
numbers  of  iterates,  while  staying  approximately  constant  in  between  these  jumps.  The 
large  decreases  in  error  level  occur  when  the  data  orbit  makes  a  close  approach  to  the 
turning  point,  causing  a  stretch  of  state  samples  to  become  sensitive  to  parameters. 
This  is  not  simply  a  product  of  discretization  in  the  algorithm,  since  the  Monte  Carlo 
estimator  sometimes  makes  no  gains  at  all,  while  other  times  great  gains  are  made,  and 
a  large  number  of  parameter  samples  are  deleted  on  the  lower  end  of  the  parameter 
sample  range. 

One might wonder what this graph would look like if we were to extend it for arbitrarily
many iterates. Consider the theory presented in chapter 3. First of all, it is likely
that $f_{p_0}$ satisfies the linking condition, and therefore exhibits a parameter shadowing
property. This means there is essentially an end to the progress that can be made in
the estimate based on dynamical information, after which stochastic convergence would
be the rule. However, there is evidence that the level of accuracy at which this effect
becomes important is probably many, many orders of magnitude smaller than the level
we are dealing with.²

¹ Interestingly, this actually does occur, apparently near areas of folding, since the filter models the
folding phenomena so poorly. Occasionally this can even cause the filter to get back in sync, moving
the parameter estimate just the right amount to lower the error. This seems to be quite rare, however.

Figure 6.3: Graph of the average error in the parameter estimate as computed by the square
root extended Kalman filter applied to the quadratic map with parameter value $p = 3.9$. Data
represents average error over 10 runs.

Figure 6.4: Asymmetry in the parameter space of the quadratic map: Here we graph the
parameter merit function $L(p)$ after processing 2500 iterates of an orbit with initial condition
$x_0 = 0.4$. The horizontal axis is the parameter error, $p - p_0$. The merit function is normalized
so that $L(p) = 0$ at the maximum. Since $\sigma = 8$, a parameter sample, $p$, is deleted if
$L(p) < -64$. This sort of asymmetrical merit function is typical of all orbits encountered in
the quadratic map, Henon map, and standard map.

This leads us to ask: assuming that we do not see the effects of parameter shadowing,
how does the parameter estimation accuracy converge with respect to $n$, the number of
state samples processed by the algorithm? As conjectured in section 3.5, we believe that
the accuracy converges at a rate proportional to $1/n^2$. A line with a slope of -2 is drawn
in figure 6.5 to suggest the conjectured asymptotic behavior. Note that the conjecture
seems plausible from the picture, although more data would be needed to really make
the evidence convincing.
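
On a log-log plot, a slope of -2 corresponds to an error that scales like $n^{-2}$, whereas purely stochastic averaging is usually associated with $n^{-1/2}$ and hence a slope of $-1/2$. The sketch below fits a least-squares slope to synthetic error curves to show how such a slope could be read off measured data. The helper name and the synthetic curves are assumptions for illustration; they are not results from the experiments.

```python
import math


def loglog_slope(ns, errors):
    """Least-squares slope of log(error) against log(n)."""
    lx = [math.log(n) for n in ns]
    ly = [math.log(e) for e in errors]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


ns = [10, 100, 1000, 10000]
dynamic_errors = [1.0 / n ** 2 for n in ns]            # idealized n^-2 decay
stochastic_errors = [1.0 / math.sqrt(n) for n in ns]   # idealized n^-1/2 decay

slope_dynamic = loglog_slope(ns, dynamic_errors)
slope_stochastic = loglog_slope(ns, stochastic_errors)
```

Real error traces converge in jumps rather than smoothly, so a fitted slope on experimental data would only be indicative, which is in line with the caution expressed above.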

In figure 6.6 we show the error in the upper bound of the parameter range being
considered by the algorithm. While the lower bound of this range is used as the parameter
estimate, the upper bound has significantly different behavior. After an initial period,
the convergence of the upper bound is governed purely by stochastic means (i.e., without
any help from the dynamics). This is predicted by Theorem 3.4.2. Thus we expect that
the convergence will be on the order of $1/\sqrt{n}$, as suggested by the line with a slope of $-1/2$
shown in the figure. The small jumps in the graphs for figure 6.6 are simply the result
of the discrete nature of how parameter space is sampled.

² It is difficult to calculate this directly, since it requires knowing the exact number of iterates it takes
an orbit from the turning point to return near the turning point. However, rough calculations suggest
that for most parameters around $p_0 = 3.9$ we expect that parameter shadowing would not be seen until
parameter deviations are less than 1 × 10^(-[exponent illegible]) for noise levels of 0.001.

Figure 6.5: Graph of the actual error in the parameter estimate of the proposed algorithm
when applied to data from the quadratic map with $p = 3.9$. A line of slope -2 is drawn on the
graph to indicate the conjectured asymptotic rate of convergence for the estimate.


6.1.4  Measurement  noise 

One other important question to ask is, what happens if we change the level of
measurement noise? The short answer is that the parameter estimate results presented here
are surprisingly insensitive to measurement noise. If we ignore the parameter shadowing
effects caused by close returns to the turning point (which we have already argued
are negligible for our experiments), then shadowing of any finite orbit is really an
all-or-nothing property in parameter space. Consider a stretch of state orbit with initial
condition $x_0$ close to the turning point. Then for a parameter value in the unfavored
direction, either the parameter value can shadow that stretch of orbit (presumably with
initial condition closer to the turning point than $x_0$), or the parameter value cannot
shadow the orbit, in which case it loses track of the original orbit exponentially fast.
Asymptotically, the measurement noise actually makes no difference in the parameter
estimate other than through parameter shadowing effects caused by linking. Thus, once
the measurement noise is lower than a certain level, the actual measurement noise makes
very little difference in the accuracy of parameter estimates.

Measurement noise does have a large effect on figure 6.6, the upper parameter bound,
and the possibility of parameter shadowing caused by linking. If the measurement noise
is large, then there are likely to be more parameter shadowing effects caused by linking. On
the other hand, if the measurement noise is really small, then the asymmetrical effect in
parameter space will in fact get drowned out for quite a while (until the sampled orbit
comes extremely close to the turning point). In most reasonable cases, however, the
asymmetry in parameter space is likely to be quite important if we want to get accurate
parameter estimates for reasonably large data sets.


6.2  Henon  map 

We  now  discuss  numerical  experiments  with  the  Henon  map: 

$$x_{n+1} = y_n + 1 - a x_n^2 \tag{6.2}$$

$$y_{n+1} = b x_n \tag{6.3}$$

where the state $(x_n, y_n) \in \mathbb{R}^2$ and the parameter values, $a$ and $b$, are invariant. For
parameter values $a = 1.4$ and $b = 0.3$, numerical evidence indicates the existence of a
chaotic attractor as shown in figure 6.7. See Henon [27] for a more detailed description
of the basic properties of the Henon map.

Figure 6.7: The Henon attractor for $a = 1.4$, $b = 0.3$.

For the purposes of testing out parameter estimation algorithms, we fix $b = 0.3$ and
attempt to estimate the parameter, $a$. State data is chosen from an orbit on the attractor
of the Henon map. Noisy measurement data is generated using a state orbit and adding
Gaussian noise with standard deviation 0.001 to each state value.
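
A minimal sketch of how such a data set might be produced is given below. The starting point at the origin, the transient that is discarded so the orbit settles onto the attractor, the seed, and the function names are all assumptions made for this example.

```python
import random


def henon_step(x, y, a=1.4, b=0.3):
    """One iterate of the Henon map: x' = y + 1 - a x^2, y' = b x."""
    return y + 1.0 - a * x * x, b * x


def noisy_henon_data(n, x0=0.0, y0=0.0, a=1.4, b=0.3,
                     noise_std=0.001, transient=1000, seed=0):
    """Return n clean states on the attractor and their noisy measurements."""
    rng = random.Random(seed)
    x, y = x0, y0
    for _ in range(transient):  # let the orbit settle onto the attractor
        x, y = henon_step(x, y, a, b)
    states, data = [], []
    for _ in range(n):
        states.append((x, y))
        # Independent Gaussian noise is added to each state component.
        data.append((x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std)))
        x, y = henon_step(x, y, a, b)
    return states, data


states, data = noisy_henon_data(2000)
```

With $b$ held fixed at 0.3, an estimator would be given `data` and asked to recover $a$.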

Applying the square root extended Kalman filter to an orbit on the attractor results
in figure 6.8. Observe that the filter diverges after about 15,000 iterates and does not
recover.  Note  that  the  figure  represents  data  for  only  one  run.  However,  the  results  in 
figure  6.8  are  representative  for  other  sequences  of  data  that  we  have  tried.  Although 
the  performance  of  the  Kalman  filter  is  quite  sensitive  to  noise,  the  key  point  is  that 
divergence  inevitably  occurs,  sooner  or  later,  and  the  performance  of  the  filter  is  generally 
unreliable. 

Note  in  figure  6.8  that  the  expected  mean  square  error  of  the  Kalman  filter  tends  to 
change  suddenly  in  jumps.  In  most  cases  these  jumps  probably  correspond  to  sections 
of  orbits  that  are  especially  sensitive  to  parameters  because  of  folding  in  state  space. 
The  Kalman  filter  has  a  tough  time  handling  the  folding  and  typically  divergence  occurs 
during  one  of  these  jumps  in  the  mean  square  error.  This  phenomenon  is  also  apparent 
in  figure  6.12.  Note  also  that  even  after  divergence,  the  parameter  estimate  sometimes 
changes  by  many  standard  deviations,  indicating  that  the  state  space  error  residual  must 
have  been  many  deviations  off.  This  again  reflects  the  fact  that  the  Kalman  filter  does 
not  model  folding  well. 

Figure 6.8: This graph depicts the performance of the Kalman filter in estimating parameter
$a$ for one sequence of noisy state data from the Henon map for $a = 1.4$ and $b = 0.3$. The data
was generated using the initial condition, $(x_0, y_0) = (0.633135448, 0.18940634)$, which is very close
to the attractor.

Figure 6.9: Asymmetry in the parameter space of the Henon map (with $a = 1.4$, $b = 0.3$):
Here we graph the parameter merit function $L(a)$ after 200000 iterates of an orbit with initial
condition on the attractor near $x_0 = (0.423, 0.208)$. Note that this merit function is actually
based on only the most sensitive 931 data points, since the linear filter threw out over 199,000
points.

We now apply the algorithm described in section 5.6. We choose to examine the top-level
scan filter every 20 iterates or so, looking for covariance matrix drops of around a
factor of 0.7 or less. The algorithm is relatively insensitive to changes in these parameters
so their choice is not particularly critical.
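
The scanning rule used with the top-level filter (examine it every 20 iterates or so and look for covariance drops of a factor of about 0.7 or less) can be sketched as follows. For simplicity the sketch tracks a single scalar parameter variance in place of the full covariance matrix; that simplification, the function name, and the synthetic variance history are assumptions for illustration only.

```python
def find_sensitive_stretches(variances, check_every=20, drop_factor=0.7):
    """Flag (start, end) windows over which the variance fell by drop_factor or more."""
    flagged = []
    for end in range(check_every, len(variances), check_every):
        start = end - check_every
        if variances[end] <= drop_factor * variances[start]:
            flagged.append((start, end))
    return flagged


# Synthetic variance history: flat, one sharp drop, then a mild drop.
variances = [1.0] * 40 + [0.5] * 40 + [0.45] * 40
flagged = find_sensitive_stretches(variances)
```

Only the window containing the sharp drop is flagged here; the mild drop from 0.5 to 0.45 is ignored. Flagged windows are the stretches that would be deferred to the more expensive Monte-Carlo analysis.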

As in the quadratic map, we find that the parameter merit function, $L(a)$, is
asymmetrical in parameter space. Specifically, $L(a)$ always has a sharp dropoff in its lower
bound, indicating that the Henon map favors higher parameters for parameter $a$ (see
figure 6.9). This property seems to be true for any orbit on the attractor. It also seems
to be true for all the parameter values of the Henon map that have been tried. We thus take
advantage of the asymmetry in parameter space in order to estimate the parameters of
the system.

Figure 6.10 shows the estimation effort for data generated from several different
initial conditions on the attractor. The tick marks on the traces of the graph denote
places where the top level scan filter deferred to the Monte-Carlo analysis. Note that
as with the quadratic map, improvements in the estimate seem to be made suddenly.
Because relatively few points are analyzed by the Monte-Carlo technique,
and because the state samples scanned by the Kalman filter do not contribute to the
parameter estimate, almost all the gain in parameter estimate must have been made
because of the dynamics.

Figure 6.10: Graph of the actual error of the parameter estimate for $a$ using the proposed
algorithm on the Henon map (with $a = 1.4$ and $b = 0.3$). This graph contains results for
four different sets of data corresponding to four different initial conditions, all chosen on the
attractor of the system. The tick marks on each trace denote places where the top level Kalman
filter deferred to a Monte-Carlo-based approach for additional analysis.


6.3  Standard map

We  now  discuss  numerical  experiments  with  the  standard  map: 


$$x_{n+1} = (x_n + y_n + K \sin x_n) \bmod 2\pi \tag{6.4}$$

$$y_{n+1} = (y_n + K \sin x_n) \bmod 2\pi \tag{6.5}$$

where $K$ is the parameter of the system and the state, $(x_n, y_n) \in T^2$, lives on the
2-torus, $T^2$. The standard map is a Hamiltonian (area-preserving) system, and thus does
not have any attractors. Instead, for example, for $K = 1$, there is apparently a mixture
of invariant tori and seas of chaos where non-periodic orbits wander around. This is
illustrated in figure 6.11. See Chirikov [13] for more discussion on the properties of the
standard map.
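
The area-preserving property can be checked directly: the Jacobian of the map with respect to $(x_n, y_n)$ has first row $(1 + K \cos x_n,\ 1)$ and second row $(K \cos x_n,\ 1)$, so its determinant is identically 1. The sketch below iterates the map and evaluates this determinant along an orbit. The function names and orbit length are assumptions for illustration; the initial condition $(0.05, 0.05)$ is the one quoted for figure 6.12.

```python
import math

TWO_PI = 2.0 * math.pi


def standard_map_step(x, y, k=1.0):
    """One iterate of the standard map on the 2-torus."""
    kick = k * math.sin(x)
    return (x + y + kick) % TWO_PI, (y + kick) % TWO_PI


def jacobian_determinant(x, k=1.0):
    """det [[1 + K cos x, 1], [K cos x, 1]], which equals 1 for every x."""
    c = k * math.cos(x)
    return (1.0 + c) * 1.0 - 1.0 * c


x, y = 0.05, 0.05
orbit = []
for _ in range(5000):
    orbit.append((x, y))
    x, y = standard_map_step(x, y)

dets = [jacobian_determinant(px) for px, _ in orbit]
```

Because the determinant is 1 everywhere, phase-space area is conserved and no attractor can form, which is why data for the experiments must be taken from an orbit that is already in a chaotic zone.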


Figure 6.11: This picture shows various orbits of the standard map near $K = 1$. Note that
since the space is a torus, the sides of the square are actually overlapping. This picture shows
a number of different orbits. Some orbits fill out dark zones of chaotic behavior, while others
remain on circular tori.

In  order  to  test  the  parameter  estimation  technique,  we  picked  K  =  1  and  generated 
data  based  on  orbits  chosen  to  be  in  a  chaotic  region.  To  each  state,  we  added  random 
Gaussian  measurement  noise  with  standard  deviation  0.001  to  produce  the  data  set.  The 
results  of  applying  the  square  root  extended  Kalman  filter  are  shown  in  figure  6.12.  As 
in  the  quadratic  map  and  Henon  map,  we  see  that  the  Kalman  filter  diverges. 

In figure 6.14 we show the result of applying the algorithm in section 5.6 to the
standard map. In particular we investigate data for five different initial conditions in
the chaotic zone. In figure 6.13 we see the effects of asymmetric shadowing in the
standard map. The algorithm used in these trials is exactly the same as the one used for
the experiments with the Henon map (not even the tunable parameters of the algorithm
were changed). This indicates that the algorithm is relatively flexible and does not have
to be tuned precisely to generate reasonable results.

Figure 6.12: This graph depicts the performance of the square root extended Kalman filter
for estimating parameter $K$ using one sequence of noisy state data from the standard map
with $K = 1$. The data was generated using the initial condition, $(x_0, y_0) = (0.05, 0.05)$. This
initial condition results in a trajectory that wanders around in a chaotic zone.

Figure 6.13: Asymmetry in the parameter space of the standard map (with $K = 1$): Here
we graph the parameter merit function $L(K)$ after 250000 iterates of an orbit with initial
condition $x_0 = (0.423, 0.208)$.


6.4  Lozi  map 

We  now  discuss  numerical  experiments  with  the  Lozi  map: 

$$x_{n+1} = y_n + 1 - a|x_n| \tag{6.6}$$

$$y_{n+1} = b x_n \tag{6.7}$$

where the state $(x_n, y_n) \in \mathbb{R}^2$ and the parameter values, $a$ and $b$, are invariant. The
Lozi map may be thought of as a piecewise linear version of the Henon map. Unlike
the Henon map, however, the Lozi map is uniformly hyperbolic where the appropriate
derivatives exist ([36]). For parameter values $a = 1.7$ and $b = 0.5$, the Lozi map has a
hyperbolic attractor ([36]) as shown in figure 6.15.

For the purposes of testing out parameter estimation algorithms, we fix $b = 0.5$ and
attempt to estimate $a$. State data is chosen from an orbit on the attractor of the Lozi
map.
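
For comparison with the Henon map, a state orbit of the Lozi map can be generated as in the sketch below. The only change from the Henon iteration is that the quadratic term is replaced by an absolute value, which makes the map piecewise linear. The function names and orbit length are assumptions for illustration; the starting point is the one quoted as lying near the attractor in the caption of figure 6.16.

```python
def lozi_step(x, y, a=1.7, b=0.5):
    """One iterate of the Lozi map: x' = y + 1 - a|x|, y' = b x."""
    return y + 1.0 - a * abs(x), b * x


def lozi_orbit(n, x0=-0.407, y0=0.430, a=1.7, b=0.5):
    """Return n successive states starting from (x0, y0)."""
    x, y = x0, y0
    states = []
    for _ in range(n):
        states.append((x, y))
        x, y = lozi_step(x, y, a, b)
    return states


lozi_states = lozi_orbit(5000)
```

Away from $x = 0$ the Jacobian is constant on each side of the fold, which is the piecewise linearity that distinguishes this system from the three maps treated earlier.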

In figure 6.16 we show the result of applying a square root extended Kalman filter
to the Lozi map. Unlike with the quadratic, Henon, and standard maps, the Kalman
filter applied to the Lozi map shows no signs of divergence, at least within 100,000
iterates. Note that the expected mean square parameter estimation error falls at
almost exactly the rate indicated by pure stochastic convergence. Thus,
the dynamics makes no asymptotic contribution to the parameter estimate, as one would
expect with a uniformly hyperbolic system.

Figure 6.14: This graph depicts the performance of the proposed algorithm for estimating
parameter $K$ using one sequence of noisy state data from the standard map with $K = 1$.
(Plot label: standard map, error in parameter estimate, log scale.)

Figure 6.15: The Lozi attractor for $a = 1.7$, $b = 0.5$.

We  cannot  really  apply  the  algorithm  from  section  5.6  to  the  Lozi  map  because  there 
are  basically  no  sensitive  orbit  sections  to  investigate.  The  whole  data  set  would  pass 
right  through  the  top  level  scanning  filter  without  further  review.  However,  even  if  we 
did  force  the  Monte-Carlo  algorithm  to  consider  all  the  data  points,  we  should  again 
find  purely  stochastic  convergence. 


Figure 6.16: This graph plots the performance of a square root extended Kalman filter in
estimating the parameter, $a$, in the uniformly hyperbolic Lozi map. The data here represents
the average over five runs based on data with different measurement noises but generated
using the parameters $a = 1.7$, $b = 0.5$, and the same initial condition on the attractor, near
$(x_0, y_0) = (-0.407, 0.430)$. Note the lack of divergence, and the fact that convergence is purely
stochastic.


Chapter 7


Conclusions  and  future  work 


7.1  Conclusions 

This report examines how to estimate the parameters of a chaotic system given
observations of the state behavior of the system. This problem is interesting in light of recent
efforts to use chaotic systems for control and signal processing applications, and because
of the possibilities for using parameter estimation in chaotic systems to develop
extremely sensitive measurement techniques. In order to evaluate the possible application
of  parameter  estimation  techniques  to  chaotic  systems,  we  approached  this  report  with 
two  main  goals  in  mind:  (1)  to  examine  the  extent  to  which  it  is  theoretically  possible 
to  estimate  the  parameters  of  a  chaotic  system,  and  (2)  to  develop  an  algorithm  to  do 
the  parameter  estimation.  Significant  progress  was  made  on  both  objectives. 

7.1.1  Theoretical  considerations 

In  order  to  examine  the  theoretical  possibilities  of  parameter  estimation,  we  first  broke 
chaotic  systems  down  into  two  categories:  structurally  stable  systems  and  systems  that 
are not structurally stable. Structurally stable systems are probably not that interesting
for measurement applications, since small perturbations in the parameters of these
systems do not result in qualitatively different state orbits. Consequently, we cannot extract
asymptotic  information  about  the  parameters  by  observing  the  dynamics  of  structurally 
stable  systems. 

The situation, however, is significantly different for systems that are not structurally
stable. It turns out that the accuracy of parameter estimates is closely related to how
orbits shadow each other for systems with slightly different parameter values. Thus,
investigating the possibilities for parameter estimation required us to examine shadowing
orbits. We discovered two interesting properties of shadowing orbits for parameterized
families of nonuniformly hyperbolic systems. First, we found that there is often an
asymmetrical shadowing behavior in the parameter space of these systems. That is, for
one-parameter families of systems, it is typically much easier for systems with slightly
higher parameter values to shadow orbits of systems with slightly lower parameter values
(or vice versa). To illustrate this property in at least one case, we proved a specific
shadowing result showing there truly is a preferred direction in parameter space for
certain maps of the interval with negative Schwarzian derivative satisfying a
Collet-Eckmann-like condition for state and parameter space derivatives.

In addition, we also found that given a typical orbit of a nonuniformly hyperbolic
system, most iterates of the orbit look locally hyperbolic, so that only a few rare stretches
of the orbit are sensitive to parameters and exhibit the asymmetrical shadowing
behavior in parameter space. These sensitive stretches of orbit seem to correspond to local
nonhyperbolic folding behavior in state space.


7.1.2  Parameter  estimation  algorithms 

In designing the new parameter estimation algorithm, we took advantage of the two
theoretical observations described above. First, since most of the state data is apparently
insensitive to parameter changes, we chose a fast top-level filter to scan through the
data before concentrating on data that might be especially sensitive. The observation
about asymmetrical shadowing behavior in parameter space is also extremely important,
since it means that we have only to investigate the sharp boundary in parameter space
between parameters that do and do not shadow the data in order to estimate what the
true parameters are.

The resulting algorithm is shown to perform significantly better than standard
parameter estimation algorithms like the extended Kalman filter. The extended Kalman
filter typically diverges for most problems involving parameter estimation of chaotic
systems. That is, the filter's covariance matrix becomes too confident about the
estimation error, effectively fixing the parameter estimate to an incorrect value without
accepting new information from additional data points. This occurs because most of
the information about the parameters of the system can be derived from observations
that experience local folding in state space, a phenomenon that is inherently difficult to
model with the local linearization techniques used by the Kalman filter.
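
The overconfidence described above can be illustrated with a scalar, linear Kalman update. This is only a minimal sketch under simplifying assumptions, not the extended Kalman filter used in the experiments of this report: the parameter is assumed static, the measurement model is assumed linear with a hypothetical sensitivity `h` and noise variance `r`, and the function name is made up for illustration. In this linear setting the variance recursion never looks at the observed data, so the gain shrinks toward zero no matter how large the residuals are.

```python
def scalar_kalman_variances(p0, h, r, n_steps):
    """Return the variance and gain histories for a static scalar parameter.

    Illustrative assumption: measurement model y_k = h * p + noise, with
    noise variance r. The recursion below never uses the data y_k.
    """
    variances = [p0]
    gains = []
    p = p0
    for _ in range(n_steps):
        gain = p * h / (h * h * p + r)   # Kalman gain for a scalar state
        p = (1.0 - gain * h) * p         # posterior variance after one update
        gains.append(gain)
        variances.append(p)
    return variances, gains


if __name__ == "__main__":
    variances, gains = scalar_kalman_variances(p0=1.0, h=1.0, r=1.0, n_steps=100)
    print(variances[-1], gains[-1])
```

Once the gain is this small, later observations barely move the estimate, which is the linear analogue of the divergence behavior described in the text.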

Our  algorithm,  on  the  other  hand,  does  not  have  the  divergence  problem  of  the 
extended  Kalman  filter.  In  several  numerical  experiments  we  demonstrated  that  the 
algorithm  described  in  this  report  achieved  accuracies  at  least  3  to  4  orders  of  magnitude 
better  than  the  extended  Kalman  filter  before  the  experiment  was  stopped.  Presumably, 
we  should  be  able  to  get  even  better  accuracies  with  the  proposed  algorithm  simply  by 
using  more  data  points.  Meanwhile,  the  divergence  problem  places  a  fairly  strict  bound 
on the accuracy of the extended Kalman filter.

Furthermore, it appears that the estimation accuracy of the proposed algorithm
converges  at  a  rate  of  ^  for  certain  systems  (where  n  is  the  number  of  state  samples 
processed).  This  is  interesting  because  it  is  significantly  better  than  the  stochastic 
convergence  one  might  typically  expect  from  most  nonchaotic  or  structurally  stable 
systems.  This  indicates  that  the  chaotic  dynamics  of  a  system  can  indeed  help  parameter 
estimation  to  some  extent,  and  opens  the  door  to  some  interesting  possible  applications 
like  high  precision  measurement. 


7.2  Future  work 

7.2.1  Theory 

Many  questions  still  remain  unanswered.  First  of  all,  I  would  like  to  know  how  to  really 
characterize  the  ability  of  a  system  to  shadow  other  systems.  Is  there  a  simple  set 
of  properties  of  a  parameterized  family  of  mappings  that  guarantee  the  asymmetry  in 
parameter  space  shadowing  behavior  for  a  large  class  of  mappings?  How  widespread 
is  this  asymmetrical  behavior  in  parameter  space  shadowing?  It  seems  likely  that  the 
situation  is  “generic”  in  some  sense,  but  how  can  we  make  this  statement  more  concrete? 

Shadowing is particularly poorly understood in higher dimensional systems. It
might be helpful to further investigate the invariant manifolds of nonuniformly
hyperbolic systems in order to better understand shadowing results. In particular, it would
be  interesting  to  investigate  more  quantitative  results  concerning  the  folding  behavior 
observed  in  this  report  and  to  specify  how  this  phenomenon  affects  shadowing  behavior 
in  general. 

There  is  also  work  to  be  done  in  figuring  out  exactly  what  the  rate  of  convergence 
is  likely  to  be  for  parameter  estimation  algorithms,  in  particular  when  those  algorithms 
are  applied  to  multi-dimensional  nonuniformly  hyperbolic  systems.  This  is  important  if 
we  would  like  to  choose  a  system  to  optimize  for  parameter  sensitivity.  The  conjectures 
of  section  3.5  seem  to  be  a  good  place  to  start. 

7.2.2  Parameter  estimation  algorithms 

There  are  a  number  of  ways  in  which  the  parameter  estimation  algorithm  could  probably 
be  improved.  For  instance,  the  biggest  problem  now  seems  to  be  in  the  behavior  of  the 
top-level  scanning  Kalman  filter.  Is  there  a  better  way  of  detecting  where  the  parameter- 
sensitive  stretches  of  data  occur?  Perhaps  a  better  solution  would  be  to  use  some  sort 
of  fixed-lag  smoother  so  that  data  is  taken  from  both  forwards  and  backwards  in  time 
in order to smooth out local stretches of parameter-sensitive data.

Also, is there a nicer way of representing the state-parameter space probability
densities? It is clear that linear representations like those in the extended Kalman filter
cannot do the job. I have tried a number of other representation forms without success,
and eventually resorted to a Monte-Carlo based method. Perhaps a more efficient but
still effective representation form for the densities can be found.


7.2.3  Applications 

Most  importantly,  there  are  still  questions  about  how  to  apply  parameter  estimation 
in  chaotic  time  series  to  problems  like  high  precision  measurement,  control,  or  other 
possible  applications.  This  report  shows  that  many  chaotic  systems  exhibit  some  special 
properties  that  would  aid  someone  who  is  interested  in  knowing  the  parameters  of  a 
system based on state data. Now that we have a better theoretical base for
understanding what factors affect parameter estimation in chaotic systems, it should be easier to
understand how and when to apply the resulting algorithmic tools.

As  for  the  possibility  of  high  precision  measurement  applications,  this  idea  certainly 
merits  additional  research  in  light  of  the  results  in  this  report.  The  main  problem  here 
would  be  to  find  a  suitable  application  where  the  quantity  to  be  measured  is  physically 
interesting  and  the  chaotic  system  involved  satisfies  all  the  right  properties.  For  instance, 
this  technique  would  ideally  be  applied  to  a  system  that  is  well-modeled  by  a  relatively 
simple  set  of  equations.  The  problem  would  be  to  find  a  suitable  setup  that  would 
make the application worthwhile, and/or to increase the sophistication of the parameter
estimation algorithms to handle a larger set of experimental situations.


Appendix A

Proofs  from  Chapter  2 


This appendix contains notes on three proofs from Chapter 2. Note that in the first two
theorems (sections A.1 and A.2), we reverse the names of the functions $f$ and $g$ from
the corresponding theorems in the text of this report. This is done to conform with the
notation used in Walters' paper [62]. The notation in the appendix is the same as in
Walters, while the notation in the text is switched.

A.1 Proof of Theorem 2.2.3

Theorem 2.2.3: (Walters) Let $f : M \to M$ be an expansive diffeomorphism with the
pseudo-orbit shadowing property. Suppose there exists a neighborhood, $V \subset \mathrm{Diff}^1(M)$, of
$f$ that is uniformly expansive. Then $f$ is structurally stable.

Proof: This is based on theorems 4 and 5 and the remark on page 237 in [62]. In theorem
4, Walters states that an expansive homeomorphism with the pseudo-orbit shadowing
property is "topologically stable." However, Walters' definition of topological stability is
weaker than our definition of structural stability. In particular, for topological stability
of $f$, Walters requires that there exist a neighborhood, $U \subset \mathrm{Diff}^1(M)$, of $f$ such that for
each $g \in U$, there is a continuous map $h : M \to M$ such that $hg = fh$. For structural
stability, this $h$ must be a homeomorphism. We can get the injectiveness of $h$ from
the uniform expansiveness of nearby maps (apply theorem 5 of [62]). We can get the
surjectiveness of $h$ from the compactness of $M$ based on an argument from algebraic
topology (see Lemma 3.11 in [38], page 36). Since $M$ is compact, and $h$ is injective and
surjective, $h$ must be a homeomorphism.

A.2 Proof of Theorem 2.2.4


Theorem 2.2.4: Let $f : M \to M$ be an expansive diffeomorphism with the function
shadowing property. Suppose there exists a neighborhood, $V \subset \mathrm{Diff}^1(M)$, of $f$ such that
$V$ is uniformly expansive. Then $f$ is structurally stable.

Proof: The proof given here is similar to theorem 4 of [62] except that the effective roles
of $f$ and $g$ are reversed (where $g$ denotes maps near $f$ in $\mathrm{Diff}^1(M)$). Instead of knowing
that all orbits of nearby systems can be shadowed by real orbits of $f$ (pseudo-orbit
shadowing), here we are given that all orbits of $f$ can be shadowed by real orbits of any
nearby system (function shadowing).

We shall prove that there is a neighborhood $U \subset V$ of $f$ in $\mathrm{Diff}^1(M)$ such that for
any $g \in U$, there exists a continuous $h$ such that $hf = gh$ (note that the $h$ we use here
is the inverse of the one in theorem 2.2.3). From this result we can use the arguments
outlined for theorem 2.2.3 to show that $h$ is a homeomorphism because of the uniform
expansiveness of $f$ and the compactness of $M$.

First we need to show the existence of a function $h : M \to M$ such that $hf = gh$.
From the function shadowing property, given any $\epsilon > 0$, there exists a neighborhood,
$U_\epsilon \subset V$, of $f$ such that any orbit of $f$ is $\epsilon$-shadowed by an orbit of $g$ for any $g \in U_\epsilon$.

Now suppose that $\epsilon \le \frac{1}{2} \inf_{g \in V} e(g)$, where $e(g)$ denotes the expansive constant of $g$.
In this case, we claim that there is exactly
one orbit of $g$ that $\epsilon$-shadows any particular orbit of $f$. If this were not true then two
different orbits of $g$, $\{x_n\}$ and $\{y_n\}$, must shadow the same orbit of $f$. But because of
the expansiveness of $g$ there must exist an integer, $N$, such that $d(x_N, y_N) > 2\epsilon$, so that
$\{x_n\}$ and $\{y_n\}$ clearly cannot $\epsilon$-shadow the same orbit of $f$. Thus we can see that there
must be a function $h$ which maps each orbit of $f$ to a shadowing orbit of $g$.

Consequently, for any $\epsilon > 0$, there exists a neighborhood $U_\epsilon$ such that for any $g \in U_\epsilon$,
we can define a function $h$ such that $hf = gh$ and:

$\sup_{x \in M} d(h(x), x) < \epsilon.$    (A.1)

We now need to show that this $h$ is also continuous. To do this we first need the following
lemma from [62]:

Lemma A.2.1 (Lemma 2 in [62]) Let $f$ be expansive with expansive constant $e(f) > 0$.
Given any $\delta > 0$, there exists $N \ge 1$ such that $d(f^n(x), f^n(y)) \le e(f)$ for $|n| \le N$
implies $d(x, y) < \delta$.

Proof of Lemma: Given $\delta > 0$, suppose that the lemma is not true so that no such
$N$ can be chosen. Then there exist sequences of points, $\{x_N\}$ and $\{y_N\}$ (not
orbits), such that for any $N \ge 1$, $d(x_N, y_N) \ge \delta$ and $d(f^n(x_N), f^n(y_N)) \le e(f)$ for all
$|n| \le N$. There exist subsequences of points, $\{x_{N_i}\}_{i=0}^{\infty}$ and $\{y_{N_i}\}_{i=0}^{\infty}$, such that $x_{N_i} \to x$
and $y_{N_i} \to y$ as $i \to \infty$ such that $d(x, y) \ge \delta$. By continuity of $f$ this implies that
$d(f^n(x), f^n(y)) \le e(f)$ for all $n$, which is a direct contradiction of the expansiveness of
$f$. This completes the proof of lemma A.2.1.

Returning to the proof of theorem 2.2.4, we now want to show the continuity of $h$. In
other words, given any $\alpha > 0$ we need to show there exists a $\delta > 0$ such that $d(x, y) < \delta$
implies $d(h(x), h(y)) < \alpha$.

Our strategy is as follows: Since $g$ is expansive, from lemma A.2.1 we know that
for any $\alpha > 0$ we can choose $N_\alpha$ such that if $d(g^n(h(x)), g^n(h(y))) \le e(g)$ for $|n| \le N_\alpha$
then $d(h(x), h(y)) < \alpha$. Thus suppose that for any $\alpha > 0$ there exists $\delta > 0$ such that
$d(x, y) < \delta$ implies $d(g^n(h(x)), g^n(h(y))) \le e(g)$ for all $|n| \le N_\alpha$. Then $d(h(x), h(y)) < \alpha$,
and $h$ must be continuous. This is what we shall show.

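Set $e(V) = \inf_{g \in V} e(g)$ and fix $\epsilon = \frac{1}{4} e(V)$. Given $\alpha > 0$, pick $\delta > 0$ such that
$d(x, y) < \delta$ implies $d(f^n(x), f^n(y)) < \frac{1}{2} e(V)$ if $|n| \le N_\alpha$. From equation (A.1) we know
that given this $\epsilon > 0$, there exists a neighborhood, $U_\epsilon \subset V$, of $f$ in $\mathrm{Diff}^1(M)$ such that
for any $g \in U_\epsilon$ there exists $h$ such that $hf = gh$ and $\sup_{x \in M} d(h(x), x) < \epsilon$. Thus for any
$g \in U_\epsilon$ and corresponding $h : M \to M$, if $d(x, y) < \delta$ then we have:

$d(g^n(h(x)), g^n(h(y))) = d(h(f^n(x)), h(f^n(y)))$

$\le d(h(f^n(x)), f^n(x)) + d(f^n(x), f^n(y)) + d(f^n(y), h(f^n(y)))$

$< \epsilon + \frac{1}{2} e(V) + \epsilon$

$\le e(V) \le e(g)$ for all $|n| \le N_\alpha$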

From  the  argument  in  the  previous  paragraph,  this  shows  that  h  must  be  continuous 
which  completes  the  proof  of  theorem  2.2.4. 


A.3 Proof of Lemma 2.3.1

Lemma 2.3.1: Suppose that $f_p \in \mathrm{Diff}^1(M)$ for $p \in I_p \subset \mathbb{R}$, and let $f(x, p) = f_p(x)$ for
any $x \in M$. Suppose also that $f$ is $C^1$ and that $f_{p_0}$ is an absolutely structurally stable
diffeomorphism for some $p_0 \in I_p$. Then there exist $\epsilon_0 > 0$ and $K > 0$ such that for every
positive $\epsilon < \epsilon_0$, any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$ for $p \in B(p_0, K\epsilon)$.

Proof: This follows from the definition of absolute structural stability. From that
definition, we know that there exist $\epsilon_0 > 0$, $K_1 > 0$, and conjugating homeomorphisms,
$h_p : M \to M$, such that if $p \in B(p_0, \epsilon_0)$, then:

$\sup_{x \in M} d(h_p^{-1}(x), x) \le K_1 \sup_{x \in M} d(f_{p_0}(x), f_p(x))$

where $f_{p_0} = h_p f_p h_p^{-1}$. Given an orbit, $\{x_n\}$, of $f_{p_0}$ we claim that $h_p^{-1}$ maps $x_n$ onto a
suitable shadowing orbit, $z_n(p)$, of $f_p$ for each $n \in \mathbb{Z}$. Also, since $f$ is $C^1$ for $(x, p) \in M \times I_p$,
there exists a constant, $K_2 > 0$, such that $\sup_{x \in M} d(f_{p_0}(x), f_p(x)) \le K_2 |p - p_0|$ for any
$p \in I_p$. Thus, setting $z_n(p) = h_p^{-1}(x_n)$ for all $n$ we see that:

$\sup_{n \in \mathbb{Z}} d(z_n(p), x_n) \le \sup_{x \in M} d(h_p^{-1}(x), x)$

$\le K_1 \sup_{x \in M} d(f_{p_0}(x), f_p(x))$

$\le K_1 K_2 |p - p_0|$

for all integer $n$. Now setting $K = \frac{1}{K_1 K_2}$ gives the desired result that $\sup_{n \in \mathbb{Z}} d(z_n(p), x_n) < \epsilon$
if $p \in B(p_0, K\epsilon)$, for all $n$ and any positive $\epsilon < \epsilon_0$. This completes the proof of
lemma 2.3.1.


Appendix B

Proof  of  theorem  3.2.1 

In  this  appendix,  we  present  the  proof  for  theorem  3.2.1. 


B.1 Preliminaries

We first repeat the related definitions which are the same as those found in chapter 3.
Throughout this appendix we shall assume that $I \subset \mathbb{R}$ represents a compact interval of
the real line.

Definitions: Suppose that $f : I \to I$ is continuous. Then the turning points of $f$ are
the local extrema of $f$ in the interior of $I$. $C(f)$ is used to designate the set of all turning
points of $f$ on $I$. $C^r(I, I)$ is the set of continuous maps on $I$ such that $f \in C^r(I, I)$ if:

(a) $f$ is $C^r$ (for $r \ge 0$)

(b) $f(I) \subset I$, and

(c) $f(Bd(I)) \subset Bd(I)$ (where $Bd(I)$ denotes the boundary of $I$).

If $f \in C^r(I, I)$ and $g \in C^r(I, I)$, let $d(f, g) = \sup_{x \in I} |f(x) - g(x)|$.
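
As a rough numerical companion to the metric just defined, the supremum can be approximated on a uniform grid. This is a sketch under stated assumptions: the interval is taken to be $[0, 1]$, the tent-map family `tent(s)` is a hypothetical example (not a family singled out by the report at this point), and a grid maximum only gives a lower bound on the true supremum for general maps.

```python
def sup_distance(f, g, a=0.0, b=1.0, n_grid=1000):
    """Approximate d(f, g) = sup over x in [a, b] of |f(x) - g(x)| on a uniform grid."""
    best = 0.0
    for i in range(n_grid + 1):
        x = a + (b - a) * i / n_grid
        best = max(best, abs(f(x) - g(x)))
    return best


def tent(s):
    """Tent map on [0, 1] with slope magnitude s (hypothetical example family)."""
    return lambda x: s * x if x <= 0.5 else s * (1.0 - x)


if __name__ == "__main__":
    # The two maps differ most at the turning point x = 1/2.
    print(sup_distance(tent(1.9), tent(2.0)))
```

For slopes at most 2, each `tent(s)` maps $[0, 1]$ into itself and sends both endpoints to 0, so it satisfies conditions (b) and (c) above.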

Definitions: A continuous map $f : I \to I$ is said to be piecewise monotone if $f$ has
finitely many turning points. $f$ is said to be a uniformly piecewise-linear mapping if it
can be written in the form:

$f(x) = a_i \pm s x$ for $x \in [c_{i-1}, c_i]$    (B.1)

where $s > 1$, $c_0 < c_1 < \ldots < c_q$ and $q > 0$ is an integer. (We assume $s > 1$ because
otherwise there will not be any interesting behavior).
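
A small sketch may help make the form (B.1) concrete. It assumes the interval $[0, 1]$ and uses the tent map as a hypothetical example with two linear pieces ($a_1 = 0$ with slope $+s$ on $[0, 1/2]$, and $a_2 = s$ with slope $-s$ on $[1/2, 1]$). The helper `turning_points` is an illustrative grid search for interior local extrema, so it is only approximate when a turning point falls between grid points.

```python
def tent(s):
    """Uniformly piecewise-linear map on [0, 1]: s*x on [0, 1/2], s - s*x on [1/2, 1]."""
    return lambda x: s * x if x <= 0.5 else s - s * x


def turning_points(f, a=0.0, b=1.0, n_grid=1000):
    """Locate interior local extrema of f on a uniform grid (sign change of the slope)."""
    xs = [a + (b - a) * i / n_grid for i in range(n_grid + 1)]
    ys = [f(x) for x in xs]
    found = []
    for i in range(1, n_grid):
        left = ys[i] - ys[i - 1]
        right = ys[i + 1] - ys[i]
        if left * right < 0.0:   # slope changes sign at an interior grid point
            found.append(xs[i])
    return found


if __name__ == "__main__":
    print(turning_points(tent(1.8)))
```

For this example the set $C(f)$ consists of the single turning point $1/2$.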

Note that for this section, it is useful to define neighborhoods, $B(x, \epsilon)$, so that they
do not extend beyond the confines of $I$. In other words, let $B(x, \epsilon) = (x - \epsilon, x + \epsilon) \cap I$.
With this in mind, we use the following definitions to describe some relevant properties
of piecewise monotone maps.

Definition: A piecewise monotone map, $f : I \to I$, is said to be transitive if for any
two open sets $U, V \subset I$, there exists an $n > 0$ such that $f^n(U) \cap V \ne \emptyset$.

Definitions: Let $f : I \to I$ be piecewise monotone. Then $f$ satisfies the linking
property if for every $c \in C(f)$ and any $\epsilon > 0$ there is a point $z \in I$ such that $z \in B(c, \epsilon)$,
$f^n(z) \in C(f)$ for some integer $n > 0$, and $|f^i(c) - f^i(z)| < \epsilon$ for every $i \in \{1, 2, \ldots, n\}$.
Suppose, in addition, that we can always pick $z \ne c$ such that the above condition is
satisfied. Then $f$ is said to satisfy the strong-linking condition.

We  are  now  ready  to  state  the  objective  of  this  appendix: 

Theorem 3.2.1 Transitive piecewise monotone maps satisfy the function shadowing
property in $C^0(I, I)$ if and only if they satisfy the strong linking property.

We note Liang Chen [12] proves a similar result, namely that the pseudo-orbit
shadowing property is equivalent to the linking property for maps topologically conjugate
to uniformly piecewise linear mappings. Some parts of the proof we describe below
are also similar to the work of Coven, Kan, and Yorke [17] for tent maps (uniformly
piecewise linear maps with one turning point). The main difference is that they prove
a pseudo-orbit shadowing property while we are interested in parameter and function
shadowing.


B.2  Proof 


This section will be devoted to the proof of theorem 3.2.1 and related results. The basic
strategy of the proof will be as follows. First we relate piecewise monotone mappings to
piecewise linear mappings through a topological conjugacy (lemmas B.2.1 and B.2.2).
This provides for uniform hyperbolicity away from the turning points. Second we capture
the effects of "folding" near turning points and show how this leads to function shadowing
(lemmas B.2.4, B.2.5, B.2.6). Finally in lemma B.2.7 we show that the local folding effects
of lemmas B.2.4, B.2.5, or B.2.6 are satisfied for the maps we are interested in.

Lemma B.2.1: Let $f : I \to I$ be a transitive piecewise-monotone mapping. Then $f$ is
topologically conjugate to a uniformly piecewise-linear mapping.

Proof:  See  Parry  [51]  and  Coven  and  Mulvey  [18]. 
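
A standard textbook instance of such a conjugacy (not an example taken from this report) is the logistic map $4x(1 - x)$ on $[0, 1]$, which is conjugate to the tent map with slope 2 through $h(x) = \sin^2(\pi x / 2)$. The sketch below checks the relation $f(h(x)) = h(g(x))$ on a grid, with $f$ the logistic map and $g$ the tent map. The function names and the grid size are illustrative assumptions.

```python
import math


def logistic(x):
    """Logistic map f(x) = 4x(1 - x) on [0, 1]."""
    return 4.0 * x * (1.0 - x)


def tent(x):
    """Tent map g with slope 2 on [0, 1], a uniformly piecewise-linear map."""
    return 2.0 * x if x <= 0.5 else 2.0 - 2.0 * x


def h(x):
    """Conjugating homeomorphism of [0, 1] for this pair of maps."""
    return math.sin(0.5 * math.pi * x) ** 2


def conjugacy_defect(n_grid=1000):
    """Largest grid value of |f(h(x)) - h(g(x))|; zero up to rounding when h g = f h."""
    worst = 0.0
    for i in range(n_grid + 1):
        x = i / n_grid
        worst = max(worst, abs(logistic(h(x)) - h(tent(x))))
    return worst


if __name__ == "__main__":
    print(conjugacy_defect())
```

Both sides reduce to $\sin^2(\pi x)$, so the defect is pure floating-point rounding.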

The  following  lemma  is  necessary  for  the  application  of  the  topological  conjugacy 
result. 

Lemma B.2.2 Let $f : I \to I$ and $g : I \to I$ be two topologically conjugate continuous
maps. If $f$ has the linking or strong linking property then $g$ must have these properties
also. If $f$ has the function shadowing property on $C^0(I, I)$, then $g$ must also
satisfy the function shadowing property on $C^0(I, I)$.

Proof: Since $f$ and $g$ are conjugate, the orbits of $f$ and $g$ are connected through a
homeomorphism, $h$, such that $g = h^{-1} f h$. Because $h$ is continuous and one-to-one, the
turning points of $f$ and $g$ must be preserved by the topological conjugacy. Thus if $f$
has the linking or strong linking properties, then $g$ must have these properties also.

Now suppose that $f$ has the function shadowing property on $C^0(I, I)$. We want to
show that $g$ also has this function shadowing property, which means that for any $\epsilon > 0$,
there exists a neighborhood, $V$, of $g$ in $C^0(I, I)$ such that if $g_* \in V$ then any orbit of $g$
is $\epsilon$-shadowed by an orbit of $g_*$.

Since $h$ is continuous, and $I$ is compact, we know that given $\epsilon > 0$ there exists $\delta > 0$
such that $|x - y| < \delta$ implies $|h(x) - h(y)| < \epsilon$ if $x, y \in I$. Given this $\delta > 0$, since $f$ has
the function shadowing property, there is a neighborhood $U \subset C^0(I, I)$ of $f$ such that
if $f_* \in U$, then any orbit of $f$ can be $\delta$-shadowed by an orbit of $f_*$. Let $V = h^{-1} U h$.
Since $g = h^{-1} f h$, $V$ must contain a neighborhood of $g$ in $C^0(I, I)$. We now must show
that if $g_* \in V$, then any orbit of $g$ can be $\epsilon$-shadowed by an orbit of $g_*$.

Suppose we are given an orbit, $\{x_n\}$, of $g$ and any $g_* \in V$. Let $\{w_n\}$ be the
corresponding orbit of $f$ such that $w_n = h^{-1}(x_n)$. Set $f_* = h^{-1}(g_*)$. Since $f_* \in U$, there exists
an orbit, $\{y_n\}$, of $f_*$ that $\delta$-shadows $\{w_n\}$. Then if $z_n = h(y_n)$, $\{z_n\}$ must be an orbit
of $g_*$ that $\epsilon$-shadows $\{x_n\}$, since $|h(x) - h(y)| < \epsilon$ if $|x - y| < \delta$. This proves the lemma.

Thus, combining lemmas B.2.1 and B.2.2, we see that the problem of proving the
function shadowing property for transitive piecewise-monotone maps with the strong
linking property reduces to proving the function shadowing property for uniformly
piecewise-linear maps with the strong-linking property.

We  now  introduce  one  more  result  that  will  be  useful  later  on: 

Lemma B.2.3 Let $f : I \to I$. Suppose $f^n$ satisfies the function shadowing property on
$C^0(I, I)$ for some integer $n > 0$. Then $f$ has the function shadowing property on $C^0(I, I)$.

Proof: Given any $\epsilon > 0$ we need to show that there exists a neighborhood, $U$, of $f$ in
$C^0(I, I)$ such that if $g \in U$, then any orbit of $f$ is $\epsilon$-shadowed by an orbit of $g$. Since $f$
is continuous and $I$ is compact, there exists a $\delta > 0$ such that if $|x - y| < \delta$, then

$|f^i(x) - f^i(y)| < \frac{1}{2}\epsilon$    (B.2)

for any $i \in \{0, 1, \ldots, n\}$ and $x, y \in I$. We also know that there exists a neighborhood,
$V_1$, of $f$ in $C^0(I, I)$ such that if $g \in V_1$:

$|f^i(x) - g^i(x)| < \frac{1}{2}\epsilon$    (B.3)

for all $x \in I$ and $i \in \{0, 1, \ldots, n\}$.

Combining (B.2) and (B.3) and using the triangle inequality we see that for any
$\epsilon > 0$ there exists a $\delta > 0$ and a neighborhood, $V_1$, of $f$ in $C^0(I, I)$ such that if $g \in V_1$
and $|x - y| < \delta$, then:

$|f^i(x) - g^i(y)| < \epsilon$    (B.4)

for all $i \in \{0, 1, \ldots, n\}$ if $x, y \in I$. Given $\epsilon > 0$, fix $\delta > 0$ and $V_1 \subset C^0(I, I)$ to satisfy
(B.4).

Using this $\delta > 0$, since $f^n$ has the function shadowing property, we know there exists
a neighborhood, $V_2$, of $f^n$ in $C^0(I, I)$ such that if $g^n \in V_2$, then any orbit of $f^n$ is
$\delta$-shadowed by an orbit of $g^n$. Given this neighborhood, $V_2$, of $f^n$, we can always pick a
neighborhood, $V_3 \subset C^0(I, I)$, of $f$ such that $g \in V_3$ implies that $g^n \in V_2$. This is apparent,
since for any $\alpha > 0$ there exists a neighborhood $V_3$ of $f$ in $C^0(I, I)$ such that

$d(f^n, g^n) = \sup_{x \in I} |f^n(x) - g^n(x)| < \alpha$

if $g \in V_3$. Thus, for any $\epsilon > 0$, if $g \in V_3$, then any orbit of $f^n$ is $\delta$-shadowed by an orbit
of $g^n$.

Now set $U = V_1 \cap V_3$. Note that $U$ must contain a neighborhood of $f$ in $C^0(I, I)$.
If we fix $g \in U$, we find that given any orbit, $\{x_i\}$, of $f$, there is an orbit, $\{y_i\}$, of $g$
such that $y_i \in B(x_i, \delta)$ if $i = kn$ for any $k \in \{0, 1, \ldots\}$. Thus, from (B.4), we know that
$y_i \in B(x_i, \epsilon)$ for all $i \ge 0$. Consequently, given any $\epsilon > 0$, there exists a neighborhood $U$
of $f$ in $C^0(I, I)$ such that if $g \in U$, then any orbit of $f$ can be $\epsilon$-shadowed by an orbit
of $g$. This is what we set out to prove.

We now examine the mechanism underlying shadowing in one-dimensional maps. In
the next three lemmas we look at how local "folding" can lead to shadowing.

Lemma B.2.4 Given $f \in C^0(I, I)$, suppose that for any $\epsilon > 0$ sufficiently small there
exists a neighborhood, $U$, of $f$ in $C^0(I, I)$ such that if $g \in U$ then

$g(B(x, \epsilon)) \supseteq B(f(x), \epsilon)$    (B.5)

for all $x \in I$. Then $f$ has the function shadowing property in $C^0(I, I)$.

Proof: Let $\{x_n\}$ be an orbit of $f$ and suppose that (B.5) is satisfied. Then if $g \in U$, for
any $y_1 \in I$ with $y_1 \in B(x_1, \epsilon)$ we can choose a $y_0 \in I$ so that $y_0 \in B(x_0, \epsilon)$ and $y_1 = g(y_0)$.
Similarly for any $y_2 \in I$ with $y_2 \in B(x_2, \epsilon)$, we can pick $y_1$ and $y_0$ within $\epsilon$ distance of $x_1$
and $x_0$, respectively. Extending this argument for arbitrarily many iterates we see that
(B.5) implies that there exists an orbit, $\{y_i\}$, of $g$ so that $y_i \in B(x_i, \epsilon)$ for all integer
$i \ge 0$. Thus, given any $\epsilon > 0$ sufficiently small, there exists a neighborhood, $U$, of $f$ in
$C^0(I, I)$ such that if $g \in U$, then any orbit of $f$ can be $\epsilon$-shadowed by an orbit of
$g$.
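
The backward choice of $y_0$ given $y_1$ in the proof above can be sketched numerically for tent maps, where preimages are explicit. This is an illustration under stated assumptions rather than a verification of condition (B.5): the orbit $\{x_k\}$ comes from a tent map with slope 1.99, the shadowing map $g$ is the tent map with slope 2 (chosen so that every point of $[0, 1]$ has a preimage on each branch), and all function names are hypothetical.

```python
def tent(s):
    """Tent map on [0, 1] with slope magnitude s."""
    return lambda x: s * x if x <= 0.5 else s * (1.0 - x)


def forward_orbit(f, x0, n):
    """Return [x_0, x_1, ..., x_n] with x_{k+1} = f(x_k)."""
    orbit = [x0]
    for _ in range(n):
        orbit.append(f(orbit[-1]))
    return orbit


def backward_shadow(xs, s_g=2.0):
    """Build an orbit of the slope-s_g tent map by pulling back from the last point.

    Assumes s_g = 2, so every y in [0, 1] has a preimage on each branch.
    """
    ys = [0.0] * len(xs)
    ys[-1] = xs[-1]                        # start the shadowing orbit at the final point
    for k in range(len(xs) - 2, -1, -1):
        if xs[k] <= 0.5:
            ys[k] = ys[k + 1] / s_g        # preimage on the increasing branch
        else:
            ys[k] = 1.0 - ys[k + 1] / s_g  # preimage on the decreasing branch
    return ys


if __name__ == "__main__":
    xs = forward_orbit(tent(1.99), 0.3, 30)
    ys = backward_shadow(xs)
    print(max(abs(x - y) for x, y in zip(xs, ys)))
```

In this setup each pull-back step halves the inherited error and adds at most 0.0025, so the constructed orbit stays within 0.005 of the original one.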

Lemma B.2.5 Let $f \in C^0(I, I)$. Suppose that for any $\epsilon > 0$ sufficiently small, there
exists $N > 0$ and a neighborhood, $U$, of $f$ in $C^0(I, I)$ such that for any $g \in U$, there
exists a function $n : I \to \mathbb{Z}^+$ so that for each $x \in I$:

$\{ g^{n(x)}(y) : |f^i(x) - g^i(y)| < \epsilon,\ 0 \le i \le n(x) \} \supseteq B[f^{n(x)}(x), \epsilon]$    (B.6)

where $1 \le n(x) \le N$ for all $x \in I$. Then $f$ has the function shadowing property in
$C^0(I, I)$.

Proof: The idea is very similar to lemma B.2.4. Let $\{x_n\}$ be an orbit of $f$. In lemma B.2.4,
given sufficiently small $\epsilon > 0$ and $g \in U$, we could always choose $y_0 \in B(x_0, \epsilon)$ given a
$y_1 \in B(x_1, \epsilon)$ so that $y_1 = g(y_0)$. A similar thing applies here except that we have to
consider the iterates in groups. Suppose that the premise of lemma B.2.5 is satisfied.
Given sufficiently small $\epsilon > 0$, fix $g \in U$. Then, for any $y_{n(x_0)} \in B(x_{n(x_0)}, \epsilon)$ there exists a
finite orbit $Y_0 = \{y_i\}_{i=0}^{n(x_0)}$ of $g$ such that $|x_i - y_i| < \epsilon$ for $i \in \{0, 1, \ldots, n(x_0)\}$. Similarly,
we can play the same trick starting with $y_{n(x_0)}$ for the next group of $n(x_{n(x_0)})$ iterates,
constructing another finite orbit, $Y_1$, of $g$. Since we are free to choose
$Y_0$ from any $y_{n(x_0)} \in B(x_{n(x_0)}, \epsilon)$, it is clear that given any $Y_1$ we can pick a $Y_0$ belonging
to the same infinite forward orbit of $g$, thereby allowing us to concatenate $Y_0$ and $Y_1$ to
construct a single finite orbit of $g$ that $\epsilon$-shadows the corresponding iterates of $\{x_n\}$.
This process can be repeated indefinitely for arbitrarily many groups of iterates, gluing
together each group of iterates as we go. Thus the function shadowing property holds.

Lemma B.2.6 Let $f \in C^0(I, I)$. Suppose that for any $\epsilon > 0$ sufficiently small, there
exists $N > 0$ and a neighborhood, $U$, of $f$ in $C^0(I, I)$ such that for any $g \in U$, there
exists a function $n : I \to \mathbb{Z}^+$ so that for each $x \in I$:

$g(\{ g^{n(x)}(y) : |x - y| < \epsilon,\ |f^i(x) - g^i(y)| < 8\epsilon,\ 1 \le i \le n(x) \}) \supseteq g(B[f^{n(x)}(x), \epsilon])$    (B.7)

where $1 \le n(x) \le N$ for all $x \in I$. Then $f$ has the function shadowing property in
$C^0(I, I)$.


Proof: (compare with lemma 2.4 of [17]). We shall show that given sufficiently small
$\epsilon > 0$ and any $g \in U$, if (B.7) is satisfied, then for any orbit, $\{x_i\}_{i=0}^{\infty}$, of $f$, there exists
an orbit, $\{y_i\}_{i=0}^{\infty}$, of $g$ such that $|x_i - y_i| < 8\epsilon$ for all integer $i \ge 0$. By condition (B.7),
given any $y^0_{n(x_0)+1} \in g(B[x_{n(x_0)}, \epsilon])$ we can choose a finite orbit, $Y_0 = \{y^0_i\}$, of $g$ that
$8\epsilon$-shadows the corresponding iterates of $\{x_i\}$ and satisfies $g(y^0_{n(x_0)}) = y^0_{n(x_0)+1}$. Similarly, using the same trick with
the next $n(x_{n(x_0)})$ iterates, we can construct a finite orbit, $Y_1 = \{y^1_i\}$, of $g$
that $8\epsilon$-shadows the corresponding iterates of $\{x_i\}$ and satisfies $y^1_{n(x_0)} \in B(x_{n(x_0)}, \epsilon)$.

Also, notice that given $Y_1$ we can always choose a $Y_0$ so that $g(y^0_{n(x_0)}) = y^1_{n(x_0)+1}$.
This is because we know that $y^1_{n(x_0)} \in B[x_{n(x_0)}, \epsilon]$ and because we are free to choose any
$y^0_{n(x_0)+1} \in g(B[x_{n(x_0)}, \epsilon])$ to construct $Y_0$. Consequently we can concatenate $Y_0$ and $Y_1$ to
form an orbit that $8\epsilon$-shadows the corresponding iterates of $\{x_i\}$. We can continue this construction by
concatenating more groups of $n(x_i)$ iterates for increasingly large $i$. Thus given (B.7) it
is apparent that we can choose an orbit, $\{y_i\}_{i=0}^{\infty}$, of $g$ that $8\epsilon$-shadows any orbit of $f$ if
$g \in U$. This proves the lemma.

Now we must show that lemma B.2.6 is satisfied for any uniformly piecewise-linear
map. Note that condition (B.6) in lemma B.2.5 in fact implies (B.7) in lemma B.2.6, so
it is sufficient to show that either (B.6) or (B.7) is true for any particular $x \in I$. This
is done in lemma B.2.7 below. We can then combine lemma B.2.7 with lemma B.2.3 to
prove theorem 3.2.1.

First, however, we introduce the following notation, in order to state our results more
concisely.

Definition: Given a map, $f \in C^0(I, I)$, define:

$D_k(x, g, \epsilon) = \{ g^k(y) : y \in I,\ |f^i(x) - g^i(y)| < \epsilon \text{ for } i \in \{0, 1, \ldots, k\} \}$

$E_k(x, g, \epsilon) = \{ g^k(y) : y \in I,\ |x - y| < \epsilon, \text{ and } |f^i(x) - g^i(y)| < 8\epsilon \text{ for } i \in \{1, 2, \ldots, k\} \}$

for any $x \in I$, $k \in \mathbb{Z}^+$, and $\epsilon > 0$ where $g \in C^0(I, I)$ is a perturbation of $f$. Although
$D_k(x, g, \epsilon)$ and $E_k(x, g, \epsilon)$ also depend on $f$ we leave out this dependence because $f$
will always refer to the uniformly piecewise linear map specified in the statement of
lemma B.2.7 below.

Lemma B.2.7: Let $f : I \to I$ be a uniformly piecewise linear map with slope $s > 9$.
Suppose that $f$ satisfies the strong linking property. Then for any $\epsilon > 0$ there exists
$N > 0$ and a neighborhood, $U$, of $f$ in $C^0(I, I)$ such that for any $g \in U$ at least one of
the following two properties holds for each $x \in I$:

(I) $D_{n(x)}(x, g, \epsilon) \supseteq B[f^{n(x)}(x), \epsilon]$

(II) $g(E_{n(x)}(x, g, \epsilon)) \supseteq g(B[f^{n(x)}(x), \epsilon])$

where $n : I \to \mathbb{Z}^+$ and $1 \le n(x) \le N$ for all $x \in I$.

Proof of lemma B.2.7: Let $C(f) = \{c_1, c_2, \ldots, c_q\}$ where $c_1 < c_2 < \ldots < c_q$. Assume
that $\epsilon > 0$ is small enough such that

$|c_k - c_i| > 16\epsilon$

for any $k \ne i$.

We now utilize the strong linking property. For each $j \in \{1, 2, \ldots, q\}$ and $k \in \mathbb{Z}^+$
define $W_k(j, \epsilon) \subset I$ such that:

$W_k(j, \epsilon) = \{ f^k(y) : y \in I,\ |f^i(c_j) - f^i(y)| < \epsilon \text{ for } i \in \{0, 1, \ldots, k\} \}$    (B.8)

Given $\epsilon > 0$, for each $j \in \{1, 2, \ldots, q\}$ let $m_j$ be the minimum $k$ such that

$W_k(j, \epsilon) \cap C(f) \ne \emptyset$    (B.9)

The strong linking property implies that such $m_j$'s exist and are finite for each $j \in
\{1, 2, \ldots, q\}$ and for any $\epsilon > 0$. From (B.8) and (B.9) we can also see that for each
$j \in \{1, 2, \ldots, q\}$, there exists some $r(j) \in \{1, 2, \ldots, q\}$ such that

$c_{r(j)} \in W_{m_j}(j, \epsilon)$.

Now  set: 

and  note  that  from  (B.8)  and  (B.9): 

m(Ci)-c.u)|<  je 

for  any  j  €  {1, 2, . . . ,  ^}.  Thus  it  is  evident  that: 

.  1 
4  < 

Because  of  the  strong  linking  property,  we  know  that  Sx  >  0. 

Also,  set  M  =  define  Aa;(5r)  -^R  such  that: 


^x{g)  =  .  inax  sup  \r{x)  -  g\x)\, 


(B.IO) 

(B.ll) 

(B.12) 

(B.13) 


and  choose  f/  to  be  a  neighborhood  of  /  in  C’(/,  I)  such  that  /!^x{g)  <  for  any  g  £  U. 
Thus  for  any  g  £  U,  any  x  £  I,  and  any  i  £  {1,2,...,  M}  : 

If  (a:) -/(x)|  <  ^e.  (B.14) 


Now, let $(a \,;\, b]$ indicate either the interval $(a, b]$ or the interval $[b, a)$, whichever is appropriate. Then, since $s > 9$, for any $\epsilon > 0$ we assert that:

$D_i(c_j, f, \epsilon) = (f^i(c_j) - \sigma_i(c_j)\epsilon \,;\, f^i(c_j)]$  (B.15)

for each $j \in \{1, 2, \ldots, q\}$ and every $i \in \{1, 2, \ldots, m_j\}$ where:

$\sigma_i(c) = +1$ if $f^i$ has a relative maximum at $c \in C(f)$,
$\sigma_i(c) = -1$ if $f^i$ has a relative minimum at $c \in C(f)$.

Note that (B.9) guarantees that $D_i(c_j, f, \epsilon) \cap C(f) = \emptyset$ for any $i \in \{1, 2, \ldots, m_j - 1\}$. Thus, since $s > 9$, (B.15) can be shown by a simple induction on $i$.

We  now  proceed  to  the  main  part  of  the  proof  for  lemma  B.2.7: 

Given any $g \in U$ we must show that for each $x \in I$ either condition (I) or (II) holds in the statement of the lemma for some $n(x) \le N$. We now break up the problem into two separate cases. Given some $\epsilon > 0$, first suppose that $x$ is more than $\epsilon$ distance away from any turning point. In other words suppose that $|x - c_j| > \epsilon$ for all $j \in \{1, 2, \ldots, q\}$. Then we can set $n(x) = 1$ and it is easy to verify that condition (I) of the lemma holds:

Di{x,g,e)  =  giB(x,e))f]B{f{x),e) 

=  B{g{x),e) 

since  3  >  9  and  l/(x)  -  g{x)\  <  ;|  for  all  a:  G  /. 

The other possibility is that $x$ is within $\epsilon$ distance of one of the turning points, in other words that $x \in V$ where:

$V = \{x \in I : |x - c_j| < \epsilon \text{ for some } j \in \{1, 2, \ldots, q\}\}$.

Below we show that for all $g \in U$, if $x \in V$ does not satisfy condition (I) then $x$ satisfies condition (II) of the lemma. This would complete the proof of lemma B.2.7.

Suppose that $|x - c_j| < \epsilon$ for some $j \in \{1, 2, \ldots, q\}$ and suppose that $x$ does not satisfy condition (I) for any $n(x) \in \{1, 2, \ldots, m_j\}$. In qualitative terms, since $f$ is expansive by a factor of $s > 9$ everywhere except at the turning points, the only way for $x$ not to satisfy condition (I) is if $x$ is close enough to $c_j$ so that $D_i(x, g, \epsilon)$ represents a "folded" line segment for every $i \in \{1, 2, \ldots, m_j\}$.

More precisely, for each $i \in \{1, 2, \ldots, m_j\}$ if we let

$J_i(x, g, \epsilon) = \{y \in I : |f^k(x) - g^k(y)| < \epsilon \text{ for } k \in \{0, 1, \ldots, i\}\}$

so that $D_i(x, g, \epsilon) = g^i(J_i(x, g, \epsilon))$, then the following claim is true.

Claim: Given $g \in U$, suppose that $x \in B(c_j, \epsilon)$ does not satisfy condition (I) of lemma B.2.7 for any $n(x) \in \{1, 2, \ldots, m_j\}$. Then for each $j \in \{1, 2, \ldots, q\}$ we claim that the following three statements are true:


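(1) For any $i \in \{1, 2, \ldots, m_j\}$, if we define $y_i(j) \in J_i(x, g, \epsilon)$ such that:

$g^i(y_i(j)) = \sup_{z \in J_i(x, g, \epsilon)} g^i(z)$ if $\sigma_i(c_j) = +1$,
$g^i(y_i(j)) = \inf_{z \in J_i(x, g, \epsilon)} g^i(z)$ if $\sigma_i(c_j) = -1$  (B.16)

then

$D_i(x, g, \epsilon) = (f^i(x) - \sigma_i(c_j)\epsilon \,;\, g^i(y_i(j))]$  (B.17)

and $g^i(y_i(j)) \in (f^i(x) - \epsilon, f^i(x) + \epsilon)$.

(2) For any $i \in \{1, 2, \ldots, m_j - 1\}$, $D_i(x, f, \epsilon) \cap C(f) = \emptyset$.

(3) For any $i \in \{1, 2, \ldots, m_j\}$, $y_i(j) \in J_i(x, f, \epsilon)$.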

Proof of claim: We prove parts (1) and (2) of this claim by induction on $i$.

First we demonstrate that if conditions (1) and (2) above are true for each $i \in \{1, 2, \ldots, k\}$ where $k \in \{1, 2, \ldots, m_j - 1\}$, then condition (1) is true for $i = k + 1$. Thus

we  assume  that  Dk{x,g,e)  has  the  form  given  in  {B.17),  if  a;  G  B{x,e),  so  that: 

Dk{x,g,e)  D  {f{x)  -  (Tk{cj)e  ;  /(x)]. 

Since  |/^(x)  —  g’‘{x)\  <  \e,  this  means: 

Dkix,g,e)  D  {f\x)  -  ak{cj)e  ;  f{x)  -  ^c^fc(cj)e]. 

In  particular  (/*^(x)-|crA:(cj))e  G  T»fc(x,^,e).  Since  D;k(x,/,e)  D  {f{x)-(Tk{cj)e  ;  /’(x)] 
and  Dk{x,f,  e)  fl  C{f)  =  0  (assuming  that  (2)  is  true  for  i  =  k)  we  know  that  [C{f)  n 
(/*(x)  —  |<rfc(cj)e  ;  /’(x))]  =  0.  Thus,  since  s  >  9  : 

-  Yk{cj)e)  G  (/*(x)  -  ^s(Tk+i{cj)t  -  4  ;  f{x)  -  ^sak+i{cj)e  +  4) 

Now suppose that $c_j$ is a relative maximum of the map $f^{k+1}$ so that $\sigma_{k+1}(c_j) = +1$ (the case where $\sigma_{k+1}(c_j) = -1$ is analogous). Then we find that:

gifi^)  -  \<^k{cj)e)  <  f{x)  -  e 

where  g{f^{x)-\a-k{cj)e)  G  g{Dk{x,g,  e)).  Thus,  since  Dk{x,g,  e)  and  hence  5'(Dfc(x,5f,  e)) 
are  connected  sets,  this  means  that  since 

Dk+iix,g,e)  =  g{Dk{x,g,e))n  B{f'^^{x),e) 


we  know  that  /^(x)  —  e  must  be  the  lower  endpoint  of  Dk+i{x,g,  e).  Also  we  know  that 
Dk+i{x,g,e)  C  (/*‘^^(x)  -  e  ;  /*+^(x)  +  e) 

because otherwise condition (I) is satisfied for $n(x) = k + 1$. Consequently by the definition of $y_{k+1}(j)$ in (B.16), we see that:

$D_{k+1}(x, g, \epsilon) = (f^{k+1}(x) - \sigma_{k+1}(c_j)\epsilon \,;\, g^{k+1}(y_{k+1}(j))]$

where $g^{k+1}(y_{k+1}(j)) \in (f^{k+1}(x) - \epsilon \,;\, f^{k+1}(x) + \epsilon)$ if $\sigma_{k+1}(c_j) = +1$. Combining this with the corresponding result for $\sigma_{k+1}(c_j) = -1$ proves that condition (1) is true for $i = k + 1$ given that (1) and (2) are true for $i = k$.

Next we show that if (1) and (2) are true for each $i \in \{1, 2, \ldots, k\}$ where $k \in \{1, 2, \ldots, m_j - 2\}$, then (2) is true for $i = k + 1$. Suppose on the contrary that (2) is not true for $i = k + 1$ so that $D_{k+1}(x, f, \epsilon) \cap C(f) \neq \emptyset$. Since $D_{k+1}(x, f, \epsilon) \subset B(f^{k+1}(x), \epsilon)$ we know that:

$f^{k+1}(x) \in B(c, \epsilon)$  (B.18)

for some $c \in C(f)$. From (B.8) and (B.9) we also know that:

f{cj)  ^  (c  ;  c+  ^(Ti(cj)e)  (B.19) 

for  any  c  €  C{f)  if  i  €  {1, 2, . . . ,  mj  —  2}. 

We  now  address  two  cases.  First  suppose  that  there  exists  some  i  G  {1,2, . . . ,  fc}  and 
c  €  C{f)  such  that: 

C  €  (/‘(i)  ;  f{ci))  (B.20) 

Let  t  be  the  minimum  value  for  which  (B.20)  holds  for  any  c  €  C{f).  Since  i  is  minimal 
we  know  that  /*  must  be  monotone  on  (x;  Cj)  so  that: 

^t(c,)(/‘(ci)  -  fix))  >  0. 

Combining  this  result  with  (B.20)  and  (B.19)  we  find  that: 

<^.(oi)(/'(c,)  -  /‘(x))  >  JC.  (B.21) 

Now  suppose  there  exists  no  i  €  {1,2,  —  ,k},  such  that: 

c  G  (fix)  ;  ficj)) 

for  any  c  G  Cif).  Note  that  since  we  assume  (2)  is  true  for  i  <  k,  this  means  there  exists 
no  i  G  {1,2,  ...,fc},  such  that: 

c  e  (fix)  ;  f  (cj))  U  Di{x,  f,  e). 

for  any  c  G  C{f).  Then  for  any  i  e  {1, 2, . . . ,  A:  +  1},  we  know  that  f  is  monotone  on 
(x;  Cj)  U  Ji(x,  /,  e).  Thus,  for  any  G  Di{x,  /,  e)  we  have: 

c^iicfificj)  -z)>0 




and  from  (B.18)  and  (B.19): 


ak+i(cj){f-^\cj)  -  f+\x))  >  ^e. 


(B.22) 


From  (B.21)  and  (B.22)  we  have  shown  that  if  (2)  is  satisfied  for  any  ^  €  {1, 2, . . . ,  fc} 
then  there  exists  t  <  k  +  1  such  that: 


This  implies  that: 


-  f{x))  >  ^e. 


McjWicj)  -  f(x))  >  e 


so  Cj  ^  Jt{x,g,e).  Thus  there  exists  some  £  G  {0,1,.  —  1}  such  that  Cj  E  Ji{x,g,€) 

for  any  i  satisfying  \  <i  <  I  but  Cj  ^  J^+i(x,^,  e).  Since  Di(x,g,  e)  fi  C{f)  =  0  for  any 
i  €  {1, 2,. . .  we  know  that: 

>  0. 

Consequently,  since  Cj  ^  e),  it  is  apparent  that: 

-  f+^{x))  >  e. 

Thus,  since  D£{x,g,e)  is  connected,  and  since  g^'^^(cj)  E  g{D£{x,g,e),  we  know  that 
f^'^^{x)  +  (7i+i{cj)e  must  be  an  endpoint  of  Di+i{x,g,t)  =  g{Di{x,g,e)  n  B{f^{x),t) 
where  £+1  <t<^+l.  This  contradicts  (1)  for  i  =  i-\-\<k-\-l.  But  we  have 
already  shown  that  if  (1)  and  (2)  are  satisfied  for  i  €  {1,2, . . . ,  fc},  then  (1)  is  satisfied 
for  z  =  A:  +  1.  Thus  if  (1)  and  (2)  are  satisfied  for  i  E  {1, 2, . . . ,  A:},  then  (2)  is  also 
satisfied  for  z  =  A:  +  1. 

We now need to show that (1) is true for $i = 1$. By definition, we can write: $D_1(x, g, \epsilon) = g((x - \epsilon, x + \epsilon)) \cap B(f(x), \epsilon)$. If condition (I) is not satisfied, then $D_1(x, g, \epsilon) \subset (f(x) - \epsilon, f(x) + \epsilon)$ and at least one endpoint of $D_1(x, g, \epsilon)$ has to correspond either to a maximum or minimum point of $g$ in the interior of $J_1(x, g, \epsilon)$. Since $s > 9$, and since all the turning points of $f$ are separated by at least $16\epsilon$, we know that the other endpoint of $D_1(x, g, \epsilon)$ must be $f(x) - \sigma_1(c_j)\epsilon$. Thus $D_1(x, g, \epsilon)$ has the form given in (B.17).

Now  we  show  that  (2)  is  true  for  z  =  1.  Suppose  that  Di{x,g,e)  D  C{f)  ^  0.  Then 
ai{cj){f{x)—c)  <  e  for  some  c  €  C{f).  Ifx  G  B{cj,e)  and  mj  >  1  then  cri{cj){f{cj)—c)  > 
|e  for  any  c  G  C{f).  Thus  ai{cj){f{cj)  —  f{x))  >  |e  which  means  that  cri{cj){g{cj)  — 
f{x))  >  e.  This  contradicts  (1)  for  z  =  1  and  completes  the  proof  of  parts  (1)  and  (2)  of 
the  claim. 

We now show that condition (3) of the claim holds. Suppose on the contrary that there exists $x \in B(c_j, \epsilon)$ for some $j \in \{1, 2, \ldots, q\}$ such that $y_i(j) \notin J_i(x, f, \epsilon)$ for some $i \in \{1, 2, \ldots, m_j\}$. Then there exists a $k \in \{0, 1, \ldots, i - 1\}$ such that $y_i(j) \notin J_{k+1}(x, f, \epsilon)$ but $y_i(j) \in J_\ell(x, f, \epsilon)$ for any integer $\ell$ satisfying $1 \le \ell \le k$. We know that:

/'=+^(x)  +  6). 

And,  since  \f+^{yk+i{j))  -  g’^’^^Vk+iiM  <  we  find  that: 

e  {f^\x)-e-S.,f^Hx)-e) 

U  {f'^\x)  +  e  ,  f+\x)  +  e  +  S)  (B.23) 

g'^^^yiU))  e  (/+^(x)-6 ,  /^+^(x)-e  +  <^.) 

u  {f+^{x)  +  e-S^  ,  f-^\x)  +  e).  (B.24) 

Also,  substituting  f  =  gin  part  (1)  of  the  claim,  we  can  see  that: 

Di{x,  /,  e)  =  (fix)  -  cri{cj)e  ;  f  (cy)]  (B.25) 

where  f  (cy)  G  {fix)  -  e  ,  f{x)  +  e)  for  any  i  €  {1,2, . . .  ,my}  provided  condition  (I) 
of  the  lemma  is  not  satisfied.  Now  suppose  o'fc+i(cy)  =  +1  (the  other  case  is  analogous). 
Then,  since  yi{j)  €  Jk{x,f,^),  we  know  that  it  cannot  be  true  that  /  ~ 

yfc+i(’a;)  +  since  that  would  contradict  (B.25).  Thus  we  can  drop  one  of  the  intervals 
in  each  the  unions  in  (B.23)  and  (B.24).  In  particular  we  find  that. 

/■^^(j/i(i))  €  if'^^ix)  -  <Tk+i{cj)  ;  f'^^ix)  -  crfc+i(cy)(e  -  ^x))-  (B.26) 


This  implies  i  ^  k  +  1  since: 

if  o-fc+i(cy)  =  +1:  g^^^{yk+i{j)) 
if  cTfc+i(cy)  =  —1:  g^'^^{yk+i{j)) 


sup  g 
inf  g 

z^Jk+l  (x,5,£) 


fc+i 


fc+i 


(z)  >  f+fx)  >  /+'(y.(j)) 
(z)  <  f+fx)  <  g'^^\yi{j)). 


But  since  Dk+i{x,  /,  e)  n  C{f)  =  0  for  +  1  <  my  we  know  from  (B.25)  that: 
(/'=+^(a:)  +  cT,+:(cy)6)  ;  f^\x))  fl  C{f)  =  t 


Thus  from  (B.26),  since  s  >  9,  it  is  clear  that 

g'^^\yi{j))^Dk+2{x,g,e). 

This  means  that  yi{j)  i  Ji{x,g,t)  ior  any  i  >  k  +  2,  so  i  <  k  +  1.  But  we  have 
already  shown  that  i^^k  +  l.  Therefore  i  <  k.  But  this  contradicts  our  assumption  that 
g  {0, 1, . . . ,  i  —  1}.  This  proves  condition  (3)  and  completes  the  proof  of  the  claim. 

Returning to the proof of lemma B.2.7 we now assert that:

$E_{m_j}(x, g, \epsilon) \supseteq (f^{m_j}(x) - 8\sigma_{m_j}(c_j)\epsilon \,;\, g^{m_j}(y_{m_j}(j))]$  (B.27)

if $x$ does not satisfy condition (I) of the lemma for any $n(x) \in \{1, 2, \ldots, m_j\}$. It is clear that $D_i(x, g, \epsilon) \subset E_i(x, g, \epsilon)$ for each $i \in \{1, 2, \ldots, m_j\}$. We also know that $|f(x) - g(x)| < \delta_x$ for all $x \in I$ so that given the form of $D_i(x, g, \epsilon)$ in (B.17) and because of the expansion factor, $s > 9$, we have that:

$E_{i+1}(x, g, \epsilon) \supseteq g(D_i(x, g, \epsilon)) \cap B(f^{i+1}(x), 8\epsilon)$

for any $i \in \{1, 2, \ldots, m_j - 1\}$. Setting $i = m_j - 1$, and substituting $D_i(x, g, \epsilon)$ in the equation above using (B.17), we get (B.27).

Now  suppose  that  CTmjicj)  =  +1  (the  case  where  cr„j^.(cy)  =  —1  is  analogous).  Then, 
from  (B.IO): 


/™>(cy)-CV(y)>104.  (B.28) 

Also,  if  condition  (I)  is  not  satisfied  for  some  x  G  B(cy,  e),  then  since  ymj{j)  ^  Dm,  (a^,  /,  e) 
we  know  that  /"^^(cy)  >  /™^(2/m,(i))  since  D  C{f)  =  0.  Thus,  because 

\f'-i{x)-g^^{x)\<S,: 

r^(ym,(i))-r>(cy)  <  (r^(2/m,(i))  +  <^.)-r^(cy) 

<  inici) + 6.)  -  nicj) 

<  4 

r^(j/n.,0))  -  nicj)  >  g-^{Ci)  -  r>(cy)  >  -4. 


(B.29) 

(B.30) 


Note that $f$ has either a local maximum or a local minimum at $c_{r(j)}$. For definiteness, assume that $f$ has a local maximum at $c_{r(j)}$ (the other case is again analogous). Then, since $|f(x) - g(x)| < \delta_x$ for all $x \in I$, there exists a local maximum of the map, $g$, at $y_1(r(j))$ such that:

$g(y_1(r(j))) = \sup_{x \in B(c_{r(j)}, 8\epsilon)} g(x)$  (B.31)

and  yi{r{j))  G  B{cr{j),2—).  (B.32) 

s 

since  the  turning  points  of  /  are  separated  by  at  least  16e  distance. 

Consequently  from  (B.28),  (B.30),  (B.32),  and  since  5  >  9  we  see  that: 

g'^’iymjU))  -  yi{r{j)) 

=  [Cr(i)  +  -  C,(y))  +  {g’^^iymjij))  -  /'”^(cy))]  -  [c^(y)  +  (yi(r(i))  -  c,.(y))] 

>  [Cr(y)  +  106^j;  -  4]  -  [Cr(j)  +  2—)] 

>  0.  (B.33)
Also,  from  (B.29),  (B.11),  and  (B.32)  and  since  s  >  9  and  ^

=  {g"^{yr.,ij))  -  n{cj))  +  -  ^ru))  -  (^Ki))  -  yMm 

<4  +  |f-2^ 

^  s 


(B.34) 


Consequently, from (B.33), (B.34), and (B.27) we see that if $x \in B(c_j, \epsilon)$ does not satisfy condition (I), then

$y_1(r(j)) \in E_{m_j}(x, g, \epsilon)$.  (B.35)

Furthermore, from (B.31) we also know that:

$g(y_1(r(j))) = \sup_{z \in E_{m_j}(x, g, \epsilon)} g(z)$.  (B.36)


It  we  assume  =  +1,  then  from  (B.27),  (B.29),  (B.ll),  (B.32),  and  since  s  >  9 


and  <  ie  we  have: 


g^^{x)  <  g^^^iymiij)) 

5 

<  <V(j)  +  2^  + 

<  yi(^(i))  +  +  -e  + 

<  yi(^(i))  +  3e 


(B.37) 


Still  assuming  (Tm,(cj)  =  +1,  then  from  (B.27),  (B.36),  (B.37),  and  since  6^  <  Je,  and 
|/(x)  -  g{x)\  <  6^  for  all  x  €  /  : 

giEm,{x,g,e))  D  {g{g^’{x)-Se)  ,  givMj))] 

2  (^(j/i(Ki))  -  Se)  ,  g{yMj))] 

2  {g{yi{r{j)))  -  Sse  +  6^  ,  £r(yi(r(j))] 


2  {g{yMj)))  -  ,  y(yi(r(i))] 


(B.38) 


Finally,  if  CTm^icj)  =  +1,  then  since  CrQ)  <  /’"^(cj)  <  Cr(j)  +  |e  and  s  >  9,  we  know  from 
(B.32)  that  Cr{j)  -  <  J/i(^(i)))  <  Cr(i)  +  3e.  Thus: 


g{B[r^{x),e])  C  (y(yi(r(j)))-4se-4 , 5(yi(r(i)))] 

2  (p(yi(Ki)))-?«^  ’  g{yMj)))] 


(B.39) 


Consequently, from (B.38) and (B.39), we have that if $x \in V$ does not satisfy condition (I) of lemma B.2.7 for any $n(x) \in \{1, 2, \ldots, m_j\}$ then:

$g(E_{m_j}(x, g, \epsilon)) \supseteq g(B[f^{m_j}(x), \epsilon])$,

satisfying condition (II) of the lemma. We already saw that condition (I) of the lemma is satisfied for $n(x) = 1$ if $x \in I \setminus V$. This proves lemma B.2.7.

Proof  of  theorem  3.2.1: 

Strong  linking  condition  — >  Function  shadowing:  Note  that  (B.6)  in  lemma  B.2.5  may 
be  rewritten  as: 


Dnw(x.s,e)2Blr''*W,£) 

and  (B.7)  in  lemma  B.2.6  may  be  rewritten  as 

so  we  can  see  these  two  statements  are  the  same  as  conditions  in  lemma  B.2.7. 

For any $x \in I$, condition (I) of lemma B.2.7 implies that condition (II) must also be true, since clearly $E_{n(x)}(x, g, \epsilon) \supseteq D_{n(x)}(x, g, \epsilon)$. Thus, combining lemmas B.2.7 and B.2.6, we see that if $f : I \to I$ is uniformly piecewise linear with $s > 9$ and the strong linking property, then $f$ must satisfy the function shadowing property on $C^0(I, I)$. Furthermore, using lemma B.2.3, we can drop the requirement that $s > 9$. We can do this since $s > 1$ for any uniformly piecewise linear map $f$, so there always exists $n > 0$ such that $f^n$ is uniformly piecewise linear and satisfies $s > 9$. Thus, from lemmas B.2.1 and B.2.2, we know that any transitive map $f : I \to I$ with the strong linking property must also satisfy the function shadowing property on $C^0(I, I)$.

Function shadowing $\Rightarrow$ Strong linking condition: Suppose that $f$ is a piecewise linear map that does not satisfy the strong linking condition. We shall first show that $f$ does not satisfy the function shadowing property on $C^0(I, I)$.

If  /  does  not  satisfy  the  strong  linking  condition,  then  there  is  a  c  €  C{f)  and  Cq  >  0 
such  that  there  exists  no  z  €  {B(c,  e)  \  c}  and  n  £  satisfying  E^{z)  £  C{f)  and 
|/*(c)  —  /*(z)|  <  Co  for  every  i  £  {1,2, .. .  ,n}.  We  will  show  that  if  e  €  (0,  |eo),  then  for 
any  ^  >  0  there  exists  a,  g  £  C9(/,  I)  that  satisfies  d{f,g)  <  8  but  has  the  property  that 
no  orbit  of  g  e— shadows  the  orbit,  {/*(c)}^o,  of  /. 

Now  given  ^  >  0  and  e  <  |eo5  choose  g  to  be  any  map  that  satisfies  the  following 
properties: 


(1) $g \in C^0(I, I)$

(2) $g(c) = f(c) - \sigma_1(c)\delta$

(3) $g(x) = f(x)$ for any $x \in \{I \setminus B(c, \epsilon_0)\}$.

(4) $\sup_{x \in B(c, \epsilon)}[\sigma_1(c)g(x)] = \sigma_1(c)g(c)$

(5) $d(f, g) < \delta$

Set  Xi  =  f{c)  and  let  y,-  =  y’(c)  so  that  {yj  is  an  orbit  of  g.  Suppose  that  A:  €  Z+  such 
that  (Ti{c){xi  -  Vi)  <  Co  for  all  i  €  {0, 1, ,  k}.  We  assert  that 

(Ti{c)(xi  -  yi)  >  (B.40) 

for  any  i  e  {1, 2, . . . ,  A;  +  1}.  It  is  not  hard  to  show  this  assertion  by  induction.  For  any 
z  G  {1,2, . . . ,  A:}  we  have  that  C{f)  fl  (xi;y,)  =  0  and  <Xj+i(c)(/(yi)  —  g{yi))  >  0-  Thus, 
since  cri+i(c)(/(x0  -  /(yi))  =  s(ri(c){xi  -  yi),  we  have  that 

cr,+i(c)(/(xi)  -  y(yi))  >  <Xi+i(c)(/(xi)  —  /(yi))  =  5cri(c)(xi  -  yi)  (B.41) 

so  that  if  (B.40)  is  true  for  i,  then  it  also  must  be  true  for  z  +  1,  provided  that  z  G 

{1,2,..., A;}. 

But  {yi}fi'  does  not  e-shadow  {xi}^=+o^.  We  can  see  this  from  (B.40)  and  from  our 
choice  of  k,  since  e  <  \tQ.  Furthermore  there  is  no  orbit  of  g  that  more  closely  shadows 
{xi}f=J  than  {yiji^o  •  This  is  because  for  any  «  €  /,  if  *  6  {1, 2, . . . ,  A:}  and  u  €  c/i(c,y,  e), 
then  (y*(u);xi)  n  C{f)  =  0  since  e  <  |eo.  Also,  using  property  (4)  of  our  choice  of  g, 
we  can  show  that  5up2gj,.(c,5,«)[o'i(c)y‘(2)]  =  <7i(c)y*(c)  for  any  z  G  {1, 2, . . . ,  A:  +  1}  by 
induction  on  i. 

Consequently, if $f$ is a piecewise linear map that does not satisfy the strong linking condition, then it cannot satisfy the function shadowing property in $C^0(I, I)$. Since the function shadowing property is preserved by topological conjugacy (lemma B.2.2) this implies that a transitive piecewise monotone map cannot exhibit function shadowing in $C^0(I, I)$ if it does not satisfy the strong linking condition.

This concludes the proof of theorem 3.2.1.

Appendix C

Proof  of  theorem  3.3.1 


This  appendix  contains  the  proof  for  theorem  3.3.1.  I  have  made  an  effort  to  make  the 
appendix  as  self-contained  as  possible,  so  that  the  reader  should  be  able  to  find  most  of 
the  relevant  definitions  and  explanations  in  this  appendix.  Naturally,  this  means  that 
the  appendix  repeats  some  material  found  elsewhere  in  this  report. 


C.1 Definitions and statement of theorem

We  first  repeat  the  related  definitions  which  are  the  same  as  those  found  in  chapter  3. 
Throughout this appendix we shall assume that $I \subset \mathbb{R}$ represents a compact interval of the real line.

Definitions: Suppose that $f : I \to I$ is continuous. Then the turning points of $f$ are the local extrema of $f$ in the interior of $I$. $C(f)$ is used to designate the set of all turning points of $f$ on $I$. Let $C^r(I, I)$ be the set of continuous maps on $I$ such that $f \in C^r(I, I)$ if the following three conditions hold:

(a) $f$ is $C^r$ (for $r \ge 0$)

(b) $f(I) \subset I$

(c) $f(Bd(I)) \subset Bd(I)$ (where $Bd(I)$ denotes the boundary of $I$).

If $f \in C^r(I, I)$ and $g \in C^r(I, I)$, let $d(f, g) = \sup_{x \in I} |f(x) - g(x)|$.
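The distance $d(f, g)$ can be approximated numerically by sampling. The following is only a minimal sketch: the helper name `sup_distance`, the uniform grid, and the two logistic maps used as an example are assumptions introduced here for illustration and do not appear in the report. A grid maximum is a lower bound on the true supremum.

```python
def sup_distance(f, g, a, b, num_points=10001):
    """Approximate d(f, g) = sup over [a, b] of |f(x) - g(x)| on a uniform grid.

    Hypothetical helper: the grid maximum only bounds the true supremum
    from below, so it is an estimate rather than an exact value.
    """
    best = 0.0
    for k in range(num_points):
        x = a + (b - a) * k / (num_points - 1)
        best = max(best, abs(f(x) - g(x)))
    return best


# Illustrative maps on I = [0, 1] (an assumption, not taken from the report).
def f_example(x):
    return 4.0 * x * (1.0 - x)


def g_example(x):
    return 3.9 * x * (1.0 - x)


if __name__ == "__main__":
    print(sup_distance(f_example, g_example, 0.0, 1.0))
```

For these two maps the difference is $0.1\,x(1 - x)$, which peaks at $x = 1/2$, so the estimate should be close to $0.025$.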

We will primarily restrict ourselves to maps with the following properties:

(C0) $g : I \to I$ is piecewise monotone.

(C1) $g$ is $C^2$ on $I$.

(C2) Let $C(g)$ be the finite set such that $c \in C(g)$ if and only if $g$ has a local extremum at $c \in I$. Then $g''(c) \neq 0$ if $c \in C(g)$ and $g'(x) \neq 0$ for all $x \in I \setminus C(g)$.

Under the Collet-Eckmann conditions, there exist constants $K_E > 0$ and $\lambda_E > 1$ such that for some $c \in C(g)$:

(CE1) $|Dg^n(g(c))| > K_E \lambda_E^n$
(CE2) $|Dg^n(z)| > K_E \lambda_E^n$ if $g^n(z) = c$

for any $n > 0$.
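Condition (CE1) can be probed numerically by multiplying derivatives along the orbit of $g(c)$. The sketch below is an illustration under stated assumptions: the logistic map $g(x) = 4x(1 - x)$ with turning point $c = 1/2$ is chosen here as an example and is not a map singled out by the report, and the function names are hypothetical.

```python
def logistic(x, p=4.0):
    """Illustrative map g(x) = p x (1 - x); an assumption, not from the report."""
    return p * x * (1.0 - x)


def d_logistic(x, p=4.0):
    """Derivative of the illustrative map with respect to x."""
    return p * (1.0 - 2.0 * x)


def orbit_derivative(x0, n, p=4.0):
    """Return |D g^n(x0)| using the chain rule:
    D g^n(x0) is the product of g'(g^i(x0)) for i = 0, ..., n - 1."""
    x, prod = x0, 1.0
    for _ in range(n):
        prod *= d_logistic(x, p)
        x = logistic(x, p)
    return abs(prod)


if __name__ == "__main__":
    c = 0.5
    gc = logistic(c)  # image of the turning point
    for n in range(1, 8):
        print(n, orbit_derivative(gc, n))
```

For this particular map the orbit of $g(c) = 1$ lands on the fixed point $0$, where the derivative is $4$, so the printed values are exactly $4^n$, which is consistent with (CE1) for, say, $K_E = 1$ and $\lambda_E = 2$.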

We consider one-parameter families of mappings, $f_p : I_x \to I_x$, parameterized by $p \in I_p$, where $I_x \subset \mathbb{R}$ and $I_p \subset \mathbb{R}$ are closed intervals of the real line. Let $f(x, p) = f_p(x)$ where $f : I_x \times I_p \to I_x$. We are primarily interested in one-parameter families of maps with the following characteristics:

(D0) For each $p \in I_p$, $f_p : I_x \to I_x$ satisfies (C0) and (C1). We also require that $C(f_p)$ remains invariant with respect to $p$ for all $p \in I_p$.

(D1) $f : I_x \times I_p \to I_x$ is $C^2$ for all $(x, p) \in I_x \times I_p$.

Note that the following notation will be used to express derivatives of $f(x, p)$ with respect to $x$ and $p$.

$D_x f(x, p) = \frac{\partial f}{\partial x}(x, p)$
$D_p f(x, p) = \frac{\partial f}{\partial p}(x, p)$

The Collet-Eckmann conditions specify that derivatives with respect to the state, $x$, grow exponentially. Similarly we will also be interested in families of maps where derivatives with respect to the parameter, $p$, also grow exponentially. In other words, we require that there exist constants $K_p > 0$, $\lambda_p > 1$, and $N > 0$ such that for some $p_0 \in I_p$, and $c \in C(f_{p_0})$:

(CP1) $|D_p f^n(c, p_0)| > K_p \lambda_p^n$

for all $n > N$. This may seem to be a rather strong constraint, but in practice it often follows whenever (CE1) holds. We can see this by expanding with the chain rule:

$D_p f^n(c, p_0) = D_x f(f^{n-1}(c, p_0), p_0)\, D_p f^{n-1}(c, p_0) + D_p f(f^{n-1}(c, p_0), p_0)$  (C.3)

to obtain the formula for $D_p f^n(c, p_0)$:

$D_p f^n(c, p_0) = D_p f(f^{n-1}(c, p_0), p_0) + \sum_{i=0}^{n-2} \left[ D_p f(f^i(c, p_0), p_0) \prod_{j=i+1}^{n-1} D_x f(f^j(c, p_0), p_0) \right]$.

Thus, if $|D_x f^n(f(c, p_0), p_0)|$ grows exponentially, we expect $|D_p f^n(c, p_0)|$ to also grow exponentially unless the parameter dependence is degenerate in some way.
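The recursion (C.3) translates directly into a loop that carries the state and the parameter derivative forward together. The following sketch assumes the logistic family $f(x, p) = p\,x(1 - x)$ purely as an example; the family, the function names, and the finite-difference cross-check are illustrative assumptions, not part of the report.

```python
def f(x, p):
    """Illustrative family f(x, p) = p x (1 - x); an assumption for this sketch."""
    return p * x * (1.0 - x)


def dfdx(x, p):
    """D_x f(x, p) for the illustrative family."""
    return p * (1.0 - 2.0 * x)


def dfdp(x, p):
    """D_p f(x, p) for the illustrative family."""
    return x * (1.0 - x)


def iterate(c, p, n):
    """Return f^n(c, p)."""
    x = c
    for _ in range(n):
        x = f(x, p)
    return x


def param_derivative(c, p, n):
    """Return D_p f^n(c, p) via recursion (C.3), starting from D_p f^0 = 0."""
    x, d = c, 0.0
    for _ in range(n):
        # d_{k+1} = D_x f(x_k, p) * d_k + D_p f(x_k, p), with x_k = f^k(c, p)
        d = dfdx(x, p) * d + dfdp(x, p)
        x = f(x, p)
    return d


def finite_difference(c, p, n, h=1e-7):
    """Central-difference estimate of D_p f^n(c, p), used only as a cross-check."""
    return (iterate(c, p + h, n) - iterate(c, p - h, n)) / (2.0 * h)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, param_derivative(0.5, 3.7, n), finite_difference(0.5, 3.7, n))
```

The two printed columns should agree to several digits for small $n$; for large $n$ the finite-difference estimate degrades because the orbit itself becomes sensitive to $p$.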

Now for any $c \in C(f_{p_0})$ define $\sigma_n(c, p)$ recursively as follows:

$\sigma_{n+1}(c, p) = \mathrm{sgn}\{D_x f(f^n(c, p), p)\}\, \sigma_n(c, p)$

where

$\sigma_1(c, p) = +1$ if $c$ is a relative maximum of $f_p$,
$\sigma_1(c, p) = -1$ if $c$ is a relative minimum of $f_p$.

Basically $\sigma_n(c, p) = 1$ if $f_p^n$ has a relative maximum at $c$ and $\sigma_n(c, p) = -1$ if $f_p^n$ has a relative minimum at $c$. We can use this notion to distinguish a particular direction in parameter space.

Definition C.1.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (D0) and (D1). Suppose that there exists $p_0 \in I_p$ such that $f_{p_0}$ satisfies (CE1) and (CP1) for some $c \in C(f_{p_0})$. Then we say that the turning point $c$ of $f_{p_0}$ favors higher parameters if there exists $N' > 0$ such that

$\mathrm{sgn}\{D_p f^n(c, p_0)\} = \mathrm{sgn}\{\sigma_n(c, p)\}$  (C.4)

for all $n > N'$. Similarly, the turning point, $c$, of $f_{p_0}$ favors lower parameters if

$\mathrm{sgn}\{D_p f^n(c, p_0)\} = -\mathrm{sgn}\{\sigma_n(c, p)\}$  (C.5)

for all $n > N'$.

The first thing to notice about these two definitions is that they are exhaustive if (CP1) is satisfied. That is, if (CP1) is satisfied for some $p_0 \in I_p$ and $c \in C(f_{p_0})$, then the turning point, $c$, of $f_{p_0}$ either favors higher parameters or favors lower parameters. We can see this from (C.3). Since $|D_p f(x, p_0)|$ is bounded for $x \in I_x$, if $|D_p f^n(c, p_0)|$ grows large enough then its sign is dominated by the signs of $D_x f(f^{n-1}(c, p_0), p_0)$ and $D_p f^{n-1}(c, p_0)$, so that either (C.4) or (C.5) must be satisfied.
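The sign comparison in (C.4) and (C.5) can be tabulated along a finite stretch of orbit. The sketch below is illustrative only: it assumes the logistic family $f(x, p) = p\,x(1 - x)$ at $p_0 = 4$ with $c = 1/2$, assumes the recursion for $\sigma_n$ starts at $n = 1$, and evaluates $\sigma_n$ at $p = p_0$; the names `sigma_and_param_derivative` and `favors` are hypothetical. A finite check like this can suggest, but not prove, which case holds for all large $n$.

```python
def f(x, p):
    """Illustrative family f(x, p) = p x (1 - x); an assumption for this sketch."""
    return p * x * (1.0 - x)


def dfdx(x, p):
    return p * (1.0 - 2.0 * x)


def dfdp(x, p):
    return x * (1.0 - x)


def sign(v):
    """Return -1, 0, or +1."""
    return (v > 0) - (v < 0)


def sigma_and_param_derivative(c, p, n_max, c_is_max=True):
    """Return [(n, sigma_n(c, p), D_p f^n(c, p))] for n = 1, ..., n_max.

    Assumes sigma_1 = +1 at a relative maximum and -1 at a relative minimum.
    """
    sigma = 1 if c_is_max else -1   # sigma_1
    d = dfdp(c, p)                  # D_p f^1(c, p)
    x = f(c, p)                     # f^1(c, p)
    rows = [(1, sigma, d)]
    for n in range(1, n_max):
        sigma = sign(dfdx(x, p)) * sigma    # sigma_{n+1} from sigma_n
        d = dfdx(x, p) * d + dfdp(x, p)     # D_p f^{n+1} via (C.3)
        x = f(x, p)
        rows.append((n + 1, sigma, d))
    return rows


def favors(rows, n_start):
    """Classify using only the finite rows with n >= n_start."""
    tail = [(s, d) for n, s, d in rows if n >= n_start]
    if all(sign(d) == s for s, d in tail):
        return "higher"
    if all(sign(d) == -s for s, d in tail):
        return "lower"
    return "undetermined"


if __name__ == "__main__":
    rows = sigma_and_param_derivative(0.5, 4.0, 8)
    for row in rows:
        print(row)
    print(favors(rows, n_start=2))
```

In this example the signs agree from $n = 2$ onward, so on the finite range examined the turning point behaves as one that favors higher parameters.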

Finally, if $p_0 \in I_p$ and $c \in C(f_{p_0})$, then for any $\epsilon > 0$, define $n_e(c, \epsilon, p_0)$ to be the smallest integer $n \ge 1$ such that $|f^n(c, p_0) - c_*| < \epsilon$ for some $c_* \in C(f_{p_0})$. We say that $n_e(c, \epsilon, p_0) = \infty$ if no such $n \ge 1$ exists.
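The quantity $n_e(c, \epsilon, p_0)$ is a first-hitting time and can be searched for directly. The sketch below makes two assumptions that are not in the report: the search is capped at `max_iter` iterations, so a returned infinity only means that no hit was found within the cap, and the function name `first_hit_time` is hypothetical.

```python
import math


def first_hit_time(f_p0, c, turning_points, eps, max_iter=10000):
    """Return the smallest n >= 1 with |f^n(c) - c_star| < eps for some
    turning point c_star, or math.inf if none is found within max_iter steps.

    f_p0 is the map x -> f(x, p0); the iteration cap is an assumption of
    this sketch, not part of the definition.
    """
    x = c
    for n in range(1, max_iter + 1):
        x = f_p0(x)
        if any(abs(x - c_star) < eps for c_star in turning_points):
            return n
    return math.inf


if __name__ == "__main__":
    # Illustrative logistic maps (an assumption, not from the report).
    print(first_hit_time(lambda x: 2.0 * x * (1.0 - x), 0.5, [0.5], 0.01))
    print(first_hit_time(lambda x: 4.0 * x * (1.0 - x), 0.5, [0.5], 0.1))
```

In the first example the turning point is a fixed point, so the hitting time is 1; in the second the orbit of the turning point falls onto the fixed point 0 and never returns near $1/2$.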

We are now ready to state the main result of this appendix.

Theorem 3.3.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (D0) and (D1). Suppose that (CP1) is satisfied for some $p_0 \in I_p$ and $c \in C(f_{p_0})$. Suppose further that $f_{p_0}$ satisfies (CE1) at $c$, and that the turning point, $c$, favors higher parameters under $f_{p_0}$. Then there exist $\delta_p > 0$, $\lambda > 1$, $K' > 0$, and $K > 1$, such
that  ifp  G  (po  -  bp,po),  then  for  any  e  >  0,  the  orbit  {fp^{c)}^-o  is  not  e— shadowed  by 
any  orbit  of  fp  if  \p  —  po\  > 

The analogous result also holds if $f_{p_0}$ favors lower parameters.

C.2 Proof


Lemma C.2.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (D0) and (D1). Then given $p_0 \in I_p$, there exist constants $K_1 > 0$, $K_2 > 0$, and $K_3 > 0$ such that the following properties are satisfied:

(1) $|D_x f(x_1, p_0) - D_x f(x_2, p_0)| < K_1|x_1 - x_2|$ for any $x_1 \in I_x$ and $x_2 \in I_x$.

(2) Let $\delta_x > 0$ be the maximal value such that $|x - c_*| < \delta_x$ implies $|D_x^2 f(x, p_0)| > 0$ for any $c_* \in C(f_{p_0})$. Then $|D_x f(x, p_0)| > K_2|x - c|$ if $|x - c| < \delta_x$ for some $c \in C(f_{p_0})$.

(3) Fix $c \in C(f_{p_0})$. Then $|D_x f(x, p) - D_x f(x, p_0)| < K_3|x - c||p - p_0|$ for any $x \in I_x$ and $p \in I_p$.


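Proof of (1): (1) is true since $f(x, p)$ is $C^2$ and $I_x \times I_p$ is compact.

Proof of (2): From (C2) we know that it is possible to choose a $\delta_x > 0$ as specified. Let $c \in C(f_{p_0})$ and $x \in I_x$. By the mean value theorem:

$|D_x f(x, p_0)| = |D_x^2 f(y, p_0)||x - c|$

for some $y \in [c \,;\, x]$. Now set:

$K_2 = \frac{1}{2} \inf_{y \in [c - \frac{1}{2}\delta_x,\; c + \frac{1}{2}\delta_x]} |D_x^2 f(y, p_0)|$.

From our choice of $\delta_x$, we know $K_2 > 0$. Thus if $|x - c| < \frac{1}{2}\delta_x$, we have that:

$|D_x f(x, p_0)| > 2K_2|x - c|$.

But since $|D_x^2 f(y, p_0)| > 0$ if $|y - c| < \delta_x$, it is evident that $|D_x f(y, p_0)| > |D_x f(c + \frac{1}{2}\delta_x, p_0)|$ for any $y \in (c + \frac{1}{2}\delta_x, c + \delta_x)$. Similarly $|D_x f(y, p_0)| > |D_x f(c - \frac{1}{2}\delta_x, p_0)|$ if $y \in (c - \delta_x, c - \frac{1}{2}\delta_x)$. Thus:

$|D_x f(x, p_0)| > K_2|x - c|$

for any $x$ satisfying $|x - c| < \delta_x$.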

Proof of (3): Fix $c \in C(f_{p_0})$ and $p_0 \in I_p$. Then for any $x \in I_x$ and $p \in I_p$, let:

$q(x, p) = D_x f(x, p) - D_x f(x, p_0)$.

Since $f$ is $C^2$, $q$ must be $C^1$. It is clear that:

$q(c, p) = 0$  (C.6)

for all $p \in I_p$ and

$q(x, p_0) = 0$  (C.7)

for all $x \in I_x$.

From (C.7) and since $q(x, p)$ is $C^1$, $q(x, p)$ satisfies a Lipschitz condition on $I_x \times I_p$ so that there exists a constant $C > 0$ such that:

$|q(x, p)| < C|p - p_0|$  (C.8)

for any $(x, p) \in I_x \times I_p$. Now define

$r(x, p) = \frac{q(x, p)}{p - p_0}$ if $p \neq p_0$,
$r(x, p) = D_p q(x, p_0)$ if $p = p_0$.  (C.9)
(C.9) 


Note  that  from  (C.8),  |r(x,p)|  <  C\Ip\  for  any  {x,p)  E  IxXlp  such  that  p  ^  po-  Since  r  is 
bounded  and  q(x,p)  is  (7^,  it  is  fairly  easy  to  check  that  r(x,p)  is  for  all  (x,p)  E  IxXip. 

From (C.9) and (C.7), we see that:

$q(x, p) = r(x, p)(p - p_0)$  (C.10)

for all $(x, p) \in I_x \times I_p$. Also from (C.6) we know $r(c, p) = 0$ for all $p \in I_p$. Thus since $r(x, p)$ is $C^1$, there exists $K_3 > 0$ such that $|r(x, p)| < K_3|x - c|$ for any $(x, p) \in I_x \times I_p$. Substituting this into (C.10) we find that:

$|q(x, p)| < K_3|x - c||p - p_0|$

for any $(x, p) \in I_x \times I_p$. This proves part (3) of the lemma.
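The bound in part (3) can be checked on a concrete family. The following sketch assumes the logistic family $f(x, p) = p\,x(1 - x)$ with $c = 1/2$, which is an example chosen here and not taken from the report; for it, $D_x f(x, p) - D_x f(x, p_0) = -2(p - p_0)(x - c)$, so the ratio computed below equals 2 everywhere and any $K_3$ slightly above 2 satisfies the strict inequality away from $x = c$ and $p = p_0$. The function name `worst_ratio` is hypothetical.

```python
def dfdx(x, p):
    """D_x f(x, p) for the illustrative family f(x, p) = p x (1 - x)."""
    return p * (1.0 - 2.0 * x)


def worst_ratio(c, p0, xs, ps):
    """Largest observed value of
    |D_x f(x, p) - D_x f(x, p0)| / (|x - c| |p - p0|)
    over the sample points, skipping points where the denominator vanishes."""
    worst = 0.0
    for x in xs:
        for p in ps:
            denom = abs(x - c) * abs(p - p0)
            if denom > 0.0:
                worst = max(worst, abs(dfdx(x, p) - dfdx(x, p0)) / denom)
    return worst


if __name__ == "__main__":
    xs = [k / 20.0 for k in range(21)]
    ps = [3.5 + k / 20.0 for k in range(11)]
    print(worst_ratio(0.5, 3.8, xs, ps))
```

The printed value is an empirical lower bound on an admissible $K_3$ over the sampled points only.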

Lemma C.2.2 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (C0) and (C1). Suppose that $f_{p_0}$ satisfies (CE1) for $p_0 \in I_p$ and some turning point, $c \in C(f_{p_0})$. Suppose that turning point $c$ of $f_{p_0}$ favors higher parameters. Given any $\lambda_0 > \lambda_1 > 1$, there exist constants $K > 1$, $\delta_p > 0$ and $\epsilon_0 > 0$ such that for any $\epsilon < \epsilon_0$, if $|p - p_0| < \delta_p$, $|f^i(c, p) - f^i(c, p_0)| < \epsilon$, and $|f^i(c, p_0) - c_*| > K\epsilon$ for all $c_* \in C(f_{p_0})$ and $1 \le i \le n$ then:

$\frac{|D_x f(f^i(c, p), p)|}{|D_x f(f^i(c, p_0), p_0)|} > \frac{\lambda_1}{\lambda_0}$  (C.11)

for all $1 \le i \le n$.


Proof: We first describe possible choices for $K > 1$, $\delta_p > 0$, and $\epsilon_0 > 0$. We then show that these choices in fact satisfy (C.11).

Fix $\delta_x > 0$ such that

$D_x^2 f(x, p_0) \neq 0$ if $|x - c_*| < \delta_x$

for any $c_* \in C(f_{p_0})$. Then let:

$J_x = \{x \in I_x : |x - c_*| > \delta_x \text{ for any } c_* \in C(f_{p_0})\}$.

Set $M_x = \inf_{x \in J_x} |D_x f(x, p_0)|$ and define:

A{a)  -  sup  sup  \Dpf{x,p)  -  Dp{x,po)\. 

3:^Ix  p€[po“tijPO+®] 

Now let $K_1 > 0$, $K_2 > 0$, and $K_3 > 0$ be the constants from lemma C.2.1. Choose:


Note  that  since  Ki  >  K21  we  know  that  K  >1.  Choose  Spi  >  0  such  that: 

A(«P,)  <  ^(1  -  |). 


Let  Sp2  =  g(l  -  and  set 


6p  =  min{5pi,  6p2}- 


Finally,  fix 


eo  =  mm{—(l 


(C.12) 


(C.13) 


(C.14) 


(C.15) 


In  order  to  show  (C.ll)  it  is  sufficient  to  show: 

A(i,p,po)  <  1  -  V 


(C.16) 


where 

,  \Dxf{f{c,p),p)- Dxf{fic,po),Po)\ 

=  |B./(/‘(c.Po),J.o)l  ' 

For  each  1  <  i  <  n  we  now  consider  two  possibilities: 

(1)  14(c,p)  -  c*|  >  Sx  for  some  c,  €  C{fpp) 

(2)  Ke  <  \  f{c,po)  -  c*l  <  Sx  for  some  c,  €  C{fpo). 


(C.17)

(Note that we know $K\epsilon < \delta_x$ from (C.15).)

From now on we assume that $|p - p_0| < \delta_p$, $|f^i(c, p) - f^i(c, p_0)| < \epsilon$, and $|f^i(c, p_0) - c_*| > K\epsilon$ for all $c_* \in C(f_{p_0})$ and $1 \le i \le n$. We wish to show that (C.16) is true for both cases (1) and (2) above for each $1 \le i \le n$.

In  case  (1)  using  (C.13),  (C. 14), (C.15),  (C.17),  and  lemma  C.2.1  we  have: 


+l^x/(f  (c,po),p)  -  D^f{f{c,po),po)\) 
■lVIx 

KiCo  Mx,  Ai, 

<  1C  "  v 
<1-^ 


(C.18) 


which  proves  the  lemma  for  case  (1). 

In  case  (2),  if  Ke  <  |/’(c,po)  —  c*|  <  for  some  c*  €  C'(/po)  then  from  lemma  C.2.1, 
(C.18),  (C.15),  and  (C.12): 


A{i,p,po)  < 


Ki\f{c,p)  -  f{c,po)\  +  ATsIf  (c,po)  -  c,||p  -  Pol 


^2l/*(c,Po)  -  c. 
Kie  Kslp-pol 
K2{Ke)  ^  K2 


<-r 


This  proves  the  lemma. 


Lemma C.2.3 Suppose that there exist constants $C > 0$, $N_0 > 0$ and $\lambda_0 > 1$ such that

$|D_p f^i(c, p_0)| > C\lambda_0^i$  (C.19)

for all $i \ge N_0$ where $p_0 \in I_p$. Suppose also that there exists $\delta_p > 0$ and $\lambda_1 \in (1, \lambda_0)$ such that for some $n > N_0$:

$\frac{|D_x f(f^i(c, p), p)|}{|D_x f(f^i(c, p_0), p_0)|} > \frac{\lambda_1}{\lambda_0}$  (C.20)

for all $1 \le i \le n$ if $|p - p_0| < \delta_p$. Then for any $\lambda_2 \in (1, \lambda_1)$, there exists $N_1 > 0$ (independent of $n$ and $\delta_p$) and $\delta_{p1} > 0$ (independent of $n$) such that

$|D_p f^i(c, p)| > C\lambda_2^i$

for all $N_1 \le i \le n + 1$ if $|p - p_0| < \delta_{p1}$.

Proof:  Given  Aq  >  1,  fix  1  <  A2  <  Ai  <  Ao-  Set  Mp  =  \Dpf{x,po)\  and  define: 

4i)  =  (^  -  l)C„A-«  -  -  2Af,  (C.21) 

Aq  A2  Aq 

It is apparent that $z(i) \to \infty$ as $i \to \infty$. Thus, it is possible to choose $N_2 > 0$ (independent of $n$ and $\delta_p$) so that $z(i) > K_0|I_p|$ for all $i \ge N_2$ where $K_0 > 0$ is the constant from lemma C.2.1 such that:

$|D_p f(x, p) - D_p f(x, p_0)| < K_0|p - p_0|$

for any $x \in I_x$ and $p \in I_p$. Let $N_1 = \max\{N_0, N_2\}$.

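We now prove the lemma by induction on $i$ for $N_1 < i < n$. From (C.19), and since $|D_p f^{N_1}(c,p)|$ is continuous with respect to $p$, there exists $\delta p_2 > 0$ such that

$|D_p f^{N_1}(c,p)| > C\lambda_2^{N_1}$    (C.22)

if $|p - p_0| < \delta p_2$. Set $\delta p_1 = \min\{\delta p, \delta p_2\}$. Thus, since $\delta p_1 > 0$ is independent of $n$, to prove the lemma it is sufficient to show that:

$\frac{|D_p f^i(c,p)|}{|D_p f^i(c,p_0)|} \geq \left(\frac{\lambda_2}{\lambda_0}\right)^i$    (C.23)

implies

$\frac{|D_p f^{i+1}(c,p)|}{|D_p f^{i+1}(c,p_0)|} \geq \left(\frac{\lambda_2}{\lambda_0}\right)^{i+1}$

for any $|p - p_0| < \delta p_1$ if $N_1 < i < n$.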

Let $E = \frac{|D_p f^{i+1}(c,p)|}{|D_p f^{i+1}(c,p_0)|}$ and let $A = |D_x f(f^i(c,p_0),p_0) D_p f^i(c,p_0)|$. Then, expanding by the chain rule:

$E \geq \frac{|D_x f(f^i(c,p),p) D_p f^i(c,p)| - |D_p f(f^i(c,p),p)|}{|D_x f(f^i(c,p_0),p_0) D_p f^i(c,p_0)| + |D_p f(f^i(c,p_0),p_0)|}$    (C.24)

Using (C.20) and (C.23):

$\frac{|D_x f(f^i(c,p),p) D_p f^i(c,p)|}{A} \geq \frac{\lambda_1}{\lambda_0}\left(\frac{\lambda_2}{\lambda_0}\right)^i = \left(\frac{\lambda_2}{\lambda_0}\right)^{i+1}\frac{\lambda_1}{\lambda_2}$    (C.25)

Also, we know from lemma C.2.1 that there exists $K_0 > 0$ such that:

$|D_p f(f^i(c,p),p)| < |D_p f(f^i(c,p),p) - D_p f(f^i(c,p),p_0)| + |D_p f(f^i(c,p),p_0) - D_p f(f^i(c,p_0),p_0)|$

$< K_0|p - p_0| + 2M_p$

Thus,  substituting  (C.25)  and  (C.26)  into  (C.24): 


\  ^  (^2i2y‘+i('AL 

^  ^ ^  '^0  ^  -^2 

^0 


A  +  Mp 

-  1)-^  -  (^°lf  -M  +  2M,)- 
A  +  Mp 


Since  \Dpp'^^{c,po)\  <  A  +  Mp  and  from  (C.19)  we  have  that 

A  >  CoA’+'  -  Mp 

Substituting  (C.28)  into  (C.27)  and  from  (C.21)  we  have: 

..  -  l)CcXi»  -  -  2M,  -  Ko\p  -  Pol 

W  +  aTW, - 


>  z{i)-Ko\p-po 

Ao  A  +  Mp 

Since  z(i)  >  Ko\p  —  po\,  for  *  >  iVj,  we  have  that: 

E  >  ()^)'+‘, 

if  Ai  <  i  <  n  which  proves  the  lemma. 


(C.26) 


(C.27) 


(C.28) 


Lemma C.2.4 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (C0) and (C1). Suppose that $f_{p_0}$ satisfies (CE1) and (CP1) for $p_0 \in I_p$ and some $c \in C(f_{p_0})$. Then there exist constants $\epsilon_0 > 0$, $K > 1$, $N_1 > 0$, $\lambda > 1$, and $\delta p > 0$ such that for any positive $\epsilon < \epsilon_0$, if $p \in B(p_0, \delta p)$ then for any $n < n_e(c, \epsilon, p_0)$ the following two conditions are true:

(1) If $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for every $1 < i < n$, then

$|D_p f^j(c,p)| > C\lambda^j$

for any $N_1 < j < n + 1$.

(2) $\max_{N_1 < i < n} |f^i(c,p) - f^i(c,p_0)| > \min\{\epsilon, C\lambda^n|p - p_0|\}$.
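Part (2) of the lemma can be illustrated numerically. The following is only a sketch under stated assumptions, not part of the original argument: it assumes the logistic family $f(x,p) = px(1-x)$ with $c = 1/2$ and $p_0 = 4$ purely as an example, and the helper name `first_escape` is hypothetical. It reports the first iterate at which the critical orbit of $f_p$ leaves an $\epsilon$-neighborhood of the critical orbit of $f_{p_0}$.

```python
def f(x, p):
    """Logistic map p*x*(1-x); an assumed example family, not taken from the report."""
    return p * x * (1.0 - x)


def first_escape(p, p0, eps, c=0.5, max_iter=200):
    """Return the first n >= 1 with |f^n(c,p) - f^n(c,p0)| >= eps, or None if none is found."""
    x, x0 = c, c
    for n in range(1, max_iter + 1):
        x, x0 = f(x, p), f(x0, p0)  # advance both critical orbits one step
        if abs(x - x0) >= eps:
            return n
    return None


if __name__ == "__main__":
    # Smaller parameter offsets should take longer to escape.
    for dp in (1e-4, 1e-6, 1e-8):
        print(dp, first_escape(4.0 - dp, 4.0, 0.01))
```

If the separation grows like $C\lambda^n|p - p_0|$, one would expect the escape time to grow roughly like $\log(1/|p - p_0|)$, which is the behavior this loop lets the reader probe.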

Proof: If $f(x,p_0)$ satisfies (CP1) for $c \in C(f_{p_0})$ then there exists $C > 0$, $N_0 > 0$, and $\lambda_0 > 0$ such that:

$|D_p f^i(c,p_0)| > C\lambda_0^i$

for all $i > N_0$. Choose $\lambda$ and $\lambda_1$ such that $1 < \lambda < \lambda_1 < \lambda_0$. Then from lemma C.2.2 we know that there exists $K > 1$, $\delta p_1 > 0$, and $\epsilon_1 > 0$ such that for any $\epsilon < \epsilon_1$, if $p \in B(p_0, \delta p_1)$, $n < n_e(c, K\epsilon, p_0)$, and $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for $1 < i < n$, then:

$\frac{|D_x f(f^i(c,p),p)|}{|D_x f(f^i(c,p_0),p_0)|} \geq \frac{\lambda_1}{\lambda_0}$

for any $1 < i < n$. From lemma C.2.3, this implies that there exists $\epsilon_0 > 0$, $\delta p_2 > 0$, and $N_1 > 0$ such that for any $\epsilon < \epsilon_0$, if $p \in B(p_0, \delta p_2)$ and $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for $1 < i < n$, then:

$|D_p f^j(c,p)| > C\lambda^j$    (C.29)

for any $j$ satisfying $N_1 < j < n + 1$ provided that $n < n_e(c, K\epsilon, p_0)$. This proves part (1) of the lemma. It also implies that

$|f^i(c,p) - f^i(c,p_0)| > C\lambda^i|p - p_0|$    (C.30)

for any $N_1 < i < n + 1$ if $n < n_e(c, K\epsilon, p_0)$.


Now  define: 


9{p)  =  max  If  (c,p)  -  f  (c,po)| 


for  any  p  G  Ip.  Since  /(x,p)  is  and  \Dpf^^{c,po)\  >  CX^\  there  exists  8p3  >  0  such 
that  g{p)  is  monotonically  increasing  in  the  interval  [po,Po  +  ^Ps]  and  monotonically 
decreasing  in  the  interval  [po  —  ^Ps^Po]-  Choose  Sp  =  min{Sp2,8p3}. 

Now fix $\epsilon < \epsilon_0$. For each $n > 0$, define $J_n$ to be the largest connected interval such that $p \in J_n$ implies that $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for $1 < i < n$, $p_0 \in J_n$, and $J_n \subset B(p_0, \delta p)$. In order to prove part (2) of the lemma it is sufficient to show that for any $p \in B(p_0, \delta p)$, if $N_1 < n < n_e(c, K\epsilon, p_0)$, then either (a) $p \in J_n$, which implies $|f^i(c,p) - f^i(c,p_0)| > C\lambda^i|p - p_0|$ for all $N_1 < i < n$, or (b) $p \notin J_n$, which implies that $|f^i(c,p) - f^i(c,p_0)| > \epsilon$ for some $N_1 < i < n$. Case (a) has already been proved above (see (C.30)). We now prove case (b).

First of all note that by our choice of $\delta p$ and $J_n$, if $p \in B(p_0, \delta p)$, then either $p \in J_{N_1}$ or $|f^i(c,p) - f^i(c,p_0)| > \epsilon$ for some $1 < i < N_1$. Now fix $p_1 \in B(p_0, \delta p)$ and suppose that $p_1 \notin J_n$ for some $n$ satisfying $N_1 < n < n_e(c, K\epsilon, p_0)$. Then, since $J_i \supset J_{i+1}$ for all $i > N_1$, we know that there exists $k < n$ such that $p_1 \in J_k \setminus J_{k+1}$ where $N_1 < k < n_e(c, K\epsilon, p_0)$. But for any $p \in J_k$ we know (see (C.29)) that $|D_p f^{k+1}(c,p)| > C\lambda^{k+1}$.

Thus $(f^{k+1}(c,p) - f^{k+1}(c,p_0))$ must be monotone with respect to $p$ for all $p \in J_k$. Consequently if $p_1 \in J_k \setminus J_{k+1}$ then $|f^{k+1}(c,p_1) - f^{k+1}(c,p_0)| > \epsilon$ where $N_1 < k < n_e(c, K\epsilon, p_0)$. This proves the lemma.


Lemma C.2.5 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (C0) and (C1). Suppose that $f_{p_0}$ satisfies (CE1) for some $p_0 \in I_p$ and $c \in C(f_{p_0})$. For any $p \in I_p$ and $n > 0$ define:

$V_n(p, \epsilon) = \{x \in I_x \mid |f^i(x,p) - f^i(c,p_0)| < \epsilon \text{ for all } 0 < i < n\}$

Then there exists $\epsilon_0 > 0$ such that for any positive $\epsilon < \epsilon_0$, and any $1 < n < n_e(c, \epsilon, p_0)$:

$\sup_{x \in V_n(p,\epsilon)} \{\sigma_n(c,p_0) f^n(x,p)\} \leq \sigma_n(c,p_0) f^n(c,p)$.    (C.31)


Proof: Proof by induction. Suppose that the elements of $C(f_{p_0})$ are $c_1 < c_2 < \ldots < c_m$, for some $m > 1$. Assume that

$\epsilon_0 < \min_i |c_{i+1} - c_i|$

In this case, (C.31) clearly holds for $n = 1$ since $\sigma_1(c,p_0) = 1$ implies that $c$ is a relative maximum of $f_{p_0}$ and $\sigma_1(c,p_0) = -1$ implies that $c$ is a relative minimum of $f_{p_0}$. Now assuming that (C.31) holds for some $n = k$ where $1 < k < n_e(c, \epsilon, p_0)$, we need to show that (C.31) holds for $n = k + 1$.

Since  k  <  n^{e).,  |/*(c,po)  —  c,|  >  e  for  any  i  G  {1,2,...  ,m}.  Consequently,  since 
|/*(a;,p)  — /^(c,po)|  <  eforanyx  G  14  (p,  e),  we  see  that  there  exists  i  G  {1,2,...  ,m— 1} 
such  that  Ci  <  X  <  Cj+i  for  every  x  G  T4(p,  c).  In  other  words,  all  elements  of  I4(p,  e) 
must  lie  on  one  monotone  branch  of  fp  and: 


sgn{Df{f{x,p),p)}  =  sgn{Df{f{c,po),po)} 

for  all  X  G  14  (p,  e). 

From  our  specification  of  crk(c,po)  we  have  that: 

(Tk+iic,po)  =  5pn{i)/(/*(c,po),po)}tTfc(c,po). 


(C.32) 


(C.33)

We can consider four cases: $\mathrm{sgn}\{Df(f^k(c,p_0),p_0)\} = \pm 1$ and $\sigma_k(c,p_0) = \pm 1$. Suppose that $\sigma_k(c,p_0) = 1$. By assumption, if $\sigma_k(c,p_0) = 1$, then

sup  nx,p)  <  r{c,p).  (C.34) 

a:eV„(p,e) 

Thus,  if  sgn{Df{f''{c,po),po)}  =  1,  then,  from  (C.33),  (Tk+iic,po)  =  1-  Also,  from 
(C.32),  we  know  that  sgn{Df  lf{x,p),p)}  =  1  for  all  x  e  Vk{p,e),  and  we  know  that 
all  elements  of  Vk{p,  e)  lie  on  a  monotonically  increasing  branch  of  fp.  Combining  this 
result  with  (C.34)  implies  that: 

sup  f^\x,p)<f'^\c,p). 

x&Vk+i{p,c) 

On  the  other  hand,  if  sgn{Df{f^{c^po),Po)}  =  then  crfc+i(c,po)  =  and 

inf  p). 

x€Vk^i(p,() 

In  both  cases  above  we  can  see  that  (C.31)  is  satisfied  for  n  =  k  +  1.  Similarly  we  can 
verify  that  (C.31)  is  also  satisfied  forn  =  A:  +  1  in  the  two  cases  where  ak{c,po)  =  -1. 
This  proves  the  lemma. 

Proof  of  theorem  3.3.1: 

We are given that $f_{p_0}$ satisfies (CE1) for some $p_0 \in I_p$ and $c \in C(f_{p_0})$. Then, from part (1) of lemma C.2.4, there exist constants $K > 1$, $C > 0$, $N_2 > 0$, $\epsilon_0 > 0$, $\delta p > 0$, and $\lambda > 1$ such that for any $\epsilon < \epsilon_0$, if $p \in B(p_0, \delta p)$, and $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for all $i$ satisfying $1 < i < n - 1$, then:

$|D_p f^n(c,p)| > C\lambda^n$    (C.35)

for any $n$ such that $N_2 < n < n_e(c, K\epsilon, p_0)$.

Now suppose that there exists $c \in C(f_{p_0})$ that favors higher parameters. Then there exists $N_3 > 0$ such that for any $n > N_3$:

$\mathrm{sgn}\{D_p f^n(c,p_0)\} = \sigma_n(c,p_0)$.    (C.36)

Set $N_1 = \max\{N_2, N_3\}$. From (C.35) and since $f$ is sufficiently smooth, it is clear that $D_p f^n(c,p)$ cannot change sign for any $p \in B(p_0, \delta p)$ if $N_1 < n < n_e(c, K\epsilon, p_0)$. Consequently, from (C.36) we have that:

$\mathrm{sgn}\{D_p f^n(c,p)\} = \sigma_n(c,p_0)$

for any $N_1 < n < n_e(c, K\epsilon, p_0)$ if $p \in B(p_0, \delta p)$ and $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for $1 < i < n - 1$. In this case:

$\mathrm{sgn}\{f^n(c,p) - f^n(c,p_0)\} = \sigma_n(c,p_0)\,\mathrm{sgn}\{p - p_0\}$.    (C.37)

Now suppose that $p < p_0$. Then from (C.37) if $\sigma_n(c,p_0) = 1$, then $f^n(c,p) < f^n(c,p_0)$ and if $\sigma_n(c,p_0) = -1$, then $f^n(c,p) > f^n(c,p_0)$ for any $p \in B(p_0, \delta p)$ such that $|f^i(c,p) - f^i(c,p_0)| < \epsilon$ for $1 < i < n - 1$, provided that $N_1 < n < n_e(c, K\epsilon, p_0)$. Combining this result with lemma C.2.5 we find that:

$\sup_{x \in V_n(p,\epsilon)} f^n(x,p) < f^n(c,p_0)$ if $\sigma_n(c,p_0) = 1$

$\inf_{x \in V_n(p,\epsilon)} f^n(x,p) > f^n(c,p_0)$ if $\sigma_n(c,p_0) = -1$

which implies that

$\inf_{x \in V_n(p,\epsilon)} |f^n(x,p) - f^n(c,p_0)| \geq |f^n(c,p) - f^n(c,p_0)|$    (C.38)

for any $p \in [p_0 - \delta p, p_0]$, if $N_1 < n < n_e(c, K\epsilon, p_0)$ (where $V_n(p, \epsilon)$ is as defined in the statement of lemma C.2.5).

Finally, from lemma C.2.4 we also know that

$\max_{N_1 < i < n} |f^i(c,p) - f^i(c,p_0)| > \min\{\epsilon, C\lambda^n|p - p_0|\}$    (C.39)

if $N_1 < n < n_e(c, K\epsilon, p_0)$ and $p \in B(p_0, \delta p)$. Combining (C.38) and (C.39) we find that:

$\inf_{x \in V_n(p,\epsilon)} |f^n(x,p) - f^n(c,p_0)| > \min\{\epsilon, C\lambda^n|p - p_0|\}$    (C.40)

if $N_1 < n < n_e(c, K\epsilon, p_0)$ and $p \in [p_0 - \delta p, p_0]$. Clearly the orbit $\{f^i(c,p_0)\}_{i=0}^{\infty}$ cannot be $\epsilon$-shadowed by an orbit of $f_p$ if

$\inf_{x \in V_n(p,\epsilon)} |f^n(x,p) - f^n(c,p_0)| > \epsilon$    (C.41)

for any finite value of $n$. Consequently from (C.40) and (C.41) we see that for any $\epsilon < \epsilon_0$, the orbit $\{f^i(c,p_0)\}_{i=0}^{\infty}$ cannot be $\epsilon$-shadowed by $f_p$ if

b  -  Pol  >  (C.42) 

and $p \in [p_0 - \delta p, p_0]$. Setting  K'  =  ^,  this  proves  the  theorem.


Appendix D

Proof  of  theorem  3.3.2 


This  appendix  contains  the  proof  for  theorem  3.3.2.  I  have  made  an  effort  to  make  the 
appendix  as  self-contained  as  possible,  so  that  the  reader  should  be  able  to  find  most  of 
the  relevant  definitions  and  explanations  in  this  appendix.  Naturally,  this  means  that 
the  appendix  repeats  some  material  found  elsewhere  in  this  report. 


D.1 Definitions and statement of theorem


Definition: Suppose that $g : I \to I$ is $C^3$ and $I \subset \mathbb{R}$. Then the Schwarzian derivative, $Sg$, of $g$ is given by the following:

$Sg(x) = \frac{g'''(x)}{g'(x)} - \frac{3}{2}\left(\frac{g''(x)}{g'(x)}\right)^2$

where $g'(x)$, $g''(x)$, $g'''(x)$ here indicate the first, second, and third derivatives of $g$.
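As a concrete check of the Schwarzian derivative, the sketch below evaluates $Sg$ for the logistic map $g(x) = px(1-x)$. The choice of map and the function names are assumptions made only for illustration; the formula used is the standard one, $Sg = g'''/g' - \frac{3}{2}(g''/g')^2$.

```python
def schwarzian(d1, d2, d3):
    """Schwarzian derivative computed from the first three derivatives at a point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


def logistic_schwarzian(x, p=4.0):
    """Sg(x) for g(x) = p*x*(1-x); undefined at x = 1/2, where g'(x) = 0."""
    d1 = p * (1.0 - 2.0 * x)  # g'(x)
    d2 = -2.0 * p             # g''(x)
    d3 = 0.0                  # g'''(x)
    return schwarzian(d1, d2, d3)


if __name__ == "__main__":
    for x in (0.1, 0.25, 0.4, 0.9):
        print(x, logistic_schwarzian(x))
```

For this map the expression simplifies to $-6/(1-2x)^2$, which is negative wherever it is defined and tends to $-\infty$ at the critical point.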

In this section we will primarily restrict ourselves to mappings with the following properties:

(A0) $g : I \to I$ is $C^3(I)$ where $I = [0, 1]$, with $g(0) = 0$ and $g(1) = 0$.

(A1) $g$ has one local maximum at $x = c$; $g$ is strictly increasing on $[0, c]$ and strictly decreasing on $[c, 1]$.

(A2) $g''(c) < 0$, $|g'(0)| > 1$.

(A3) The Schwarzian derivative of $g$ is negative, $Sg(x) < 0$, over all $x \in I$ (we allow $Sg(x) = -\infty$).

Under the Collet-Eckmann conditions, there exist constants $K_E > 0$ and $\lambda_E > 1$ such that for some $c \in C(g)$:

(CE1) $|Dg^n(g(c))| > K_E\lambda_E^n$

(CE2) $|Dg^n(z)| > K_E\lambda_E^n$ if $g^n(z) = c$

for any $n > 0$.
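Condition (CE1) can be checked directly in simple cases. The sketch below assumes the logistic map $g(x) = 4x(1-x)$ as an example (an assumption for illustration, not a map singled out at this point in the text) and computes $Dg^n(g(c))$ by the chain rule. For this map $g(c) = 1$ and $g(1) = 0$ is a fixed point with $g'(0) = 4$, so the derivative along the critical orbit grows like $4^n$.

```python
def g(x, p=4.0):
    """Logistic map, assumed here as an example."""
    return p * x * (1.0 - x)


def dg(x, p=4.0):
    """Derivative of the logistic map with respect to x."""
    return p * (1.0 - 2.0 * x)


def orbit_derivative(x, n, p=4.0):
    """D g^n(x) by the chain rule: the product of g'(g^i(x)) for 0 <= i < n."""
    prod = 1.0
    for _ in range(n):
        prod *= dg(x, p)
        x = g(x, p)
    return prod


if __name__ == "__main__":
    c = 0.5
    for n in range(1, 8):
        print(n, abs(orbit_derivative(g(c), n)))
```

In this example the growth condition holds with, for instance, $K_E = 1/2$ and $\lambda_E = 4$.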

We will be investigating one-parameter families of mappings, $f : I_x \times I_p \to I_x$, where $p$ is the parameter and $I_x, I_p \subset \mathbb{R}$ are closed intervals. Let $f_p(x) = f(x,p)$ where $f_p : I_x \to I_x$. We are primarily interested in one-parameter families of maps with the following characteristics:

(B0) For each $p \in I_p$, $f_p : I_x \to I_x$ satisfies (A0), (A1), (A2), and (A3) where $I_x = [0, 1]$. For each $p$, we also require that $f_p$ has a turning point at $c$, where $c$ is constant with respect to $p$.

(Bl)  f  :  Ix  y.  Ip  Ix  is  for  all  {x,p)  G  Ix  y  Ip- 

Another  concept  we  shall  need  is  that  of  the  kneading  invariant.  Kneading  invariants 
and  many  associated  topics  are  discussed  in  Milnor  and  Thurston  [34]. 

Definition: If $g$ is a piecewise monotone map with exactly one turning point at $c$, then the kneading invariant, $D(g,t)$, of $g$ is defined as follows:

$D(g,t) = 1 + \theta_1(g)t + \theta_2(g)t^2 + \ldots + \theta_n(g)t^n + \ldots$

where

$\theta_n(g) = \epsilon_1(g)\epsilon_2(g) \cdots \epsilon_n(g)$

$\epsilon_n(g) = \lim_{x \to c} \mathrm{sgn}(Dg(g^n(x)))$

for $n \geq 1$. If $c$ is a relative maximum of $g$, then one interpretation of $\theta_n(g)$ is that it represents whether $g^{n+1}$ has a relative maximum ($\theta_n(g) = +1$) or minimum ($\theta_n(g) = -1$) at $c$.

We can also order these kneading invariants in the following way. We will say that $|D(g,t)| < |D(h,t)|$ if $\theta_i(g) = \theta_i(h)$ for $1 \leq i < n$, but $\theta_n(g) < \theta_n(h)$. A kneading invariant, $D(f_p,t)$, is said to be monotonically decreasing with respect to $p$ if $p_1 > p_0$ implies $|D(f_{p_1},t)| \leq |D(f_{p_0},t)|$.
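The leading kneading coefficients can be estimated numerically. The sketch below rests on several assumptions that are not confirmed by the text: it takes $\theta_n$ to be the running product of the signs $\epsilon_1, \ldots, \epsilon_n$, approximates the limit by a fixed small offset to the right of $c$, and uses the logistic map as the example family. The function name `kneading_coefficients` is hypothetical, and the result is only trustworthy for the first few terms, before the offset has been amplified.

```python
def g(x, p):
    """Logistic map, assumed here as an example unimodal family."""
    return p * x * (1.0 - x)


def dg(x, p):
    """Derivative of the logistic map with respect to x."""
    return p * (1.0 - 2.0 * x)


def kneading_coefficients(p, n_terms, c=0.5, offset=1e-9):
    """Approximate theta_1..theta_n as running products of sgn(Dg(g^n(x))) for x near c."""
    x = c + offset  # stand-in for the limit x -> c (one-sided, an assumption)
    theta = 1
    thetas = []
    for _ in range(n_terms):
        x = g(x, p)                       # x is now g^n of the starting point
        eps = 1 if dg(x, p) > 0 else -1   # sign of Dg at g^n(x)
        theta *= eps
        thetas.append(theta)
    return thetas


if __name__ == "__main__":
    print(kneading_coefficients(4.0, 2))
    print(kneading_coefficients(3.8, 5))
```

For $p = 4$ the first two coefficients come out as $-1$, which matches the interpretation in the text: $g^2$ and $g^3$ both have a relative minimum (value 0) at $c$.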

We  are  now  ready  to  state  the  main  result  of  this  appendix: 

Theorem 3.3.2 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying (B0) and (B1). Suppose that $p_0 \in \mathrm{int}(I_p)$ is such that $f_{p_0}$ satisfies (CE1) (henceforth, if $A \subset \mathbb{R}$, let $\mathrm{int}(A)$ denote the interior of $A$). Also, suppose that the kneading invariant, $D(f_p,t)$, is monotonically decreasing with respect to $p$ in some neighborhood of $p = p_0$. Then there exists $\delta p > 0$ and $C > 0$ such that for every $x_0 \in I_x$ there is a set, $W(x_0) \subset I_x \times I_p$, satisfying the following conditions:

(1) $W(x_0) = \{(\alpha_{x_0}(t), \beta_{x_0}(t)) \mid t \in [0, 1]\}$ where $\alpha_{x_0} : [0, 1] \to I_x$ and $\beta_{x_0} : [0, 1] \to I_p$ are continuous and $\beta_{x_0}(t)$ is monotonically increasing with respect to $t$ with $\beta_{x_0}(0) = p_0$ and $\beta_{x_0}(1) = p_0 + \delta p$.

(2)  For  any  xq  €  4,  if{x,p)  6  W{xf)  then  ir(x,p)  - /"(x^po)!  <  C{p-po)^  for  all 
n  >  0. 

D.2 Tools for maps with negative Schwarzian derivative

There has been a significant amount of interest in recent years in one-dimensional maps, particularly maps with negative Schwarzian derivative. Below we state some useful properties and analytical tools that have been developed to analyze these maps. For the most part, the results are only stated here, and references provided to appropriate proofs. We do not attempt to trace the history of the development of these results.

The only results in this section that are new are contained in lemmas D.2.11, D.2.12, and D.2.13.

Lemma D.2.1 If $g$ satisfies (A0), (A1), and (A2) then there exist constants $K_0 > 0$ and $K_1 > 0$ such that for all $x \in I$:

(1) $K_0|x - c| \leq |Dg(x)| \leq K_1|x - c|$

(2) $\frac{1}{2}K_0|x - c|^2 \leq |g(x) - g(c)| \leq \frac{1}{2}K_1|x - c|^2$

Proof: This is clear, since $g''(c) \neq 0$.
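The bounds of lemma D.2.1 can be made explicit for a specific map. As a hedged illustration (the logistic map is an assumed example, and the function name is hypothetical), for $g(x) = px(1-x)$ with $c = 1/2$ one has $|Dg(x)| = 2p|x - c|$ and $|g(x) - g(c)| = p|x - c|^2$, so both inequalities hold with equality for $K_0 = K_1 = 2p$. The sketch below confirms this on a grid.

```python
def check_quadratic_bounds(p=4.0, c=0.5, samples=1001):
    """Return the largest deviation from |Dg| = K|x-c| and |g(x)-g(c)| = K|x-c|^2/2."""
    K = 2.0 * p  # assumed constants: K0 = K1 = 2p for the logistic map
    worst = 0.0
    for j in range(samples):
        x = j / (samples - 1)
        dist = abs(x - c)
        deriv = abs(p * (1.0 - 2.0 * x))                       # |Dg(x)|
        gap = abs(p * x * (1.0 - x) - p * c * (1.0 - c))       # |g(x) - g(c)|
        worst = max(worst, abs(deriv - K * dist), abs(gap - 0.5 * K * dist ** 2))
    return worst


if __name__ == "__main__":
    print(check_quadratic_bounds())
```

For a general map satisfying (A0) to (A2) the two constants differ, but the same quadratic behavior near $c$ is what the lemma records.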

Lemma D.2.2 If $f(x,p)$ satisfies (B0) and (B1), then there exist constants $K_0 > 0$ and $K_1 > 0$ such that for any $x \in I_x$, $y \in I_x$, $p_0 \in I_p$, and $p_1 \in I_p$:

(1) $|D_x f(x,p_0) - D_x f(y,p_0)| \leq K_0|x - y|$

(2) $|D_x f(x,p_0) - D_x f(x,p_1)| \leq K_1|p_0 - p_1|$

Proof: This is clear, since $f(x,p)$ is sufficiently smooth by (B1) and $I_x \times I_p$ is compact.

Lemma D.2.3 (Minimum Principle). Suppose that $g$ has negative Schwarzian derivative. Let $J = [x_0, x_1]$ be an interval on which $g$ is monotone. Then

$|Dg(x)| \geq \min\{|Dg(x_0)|, |Dg(x_1)|\}$

for all $x \in J$.

Proof:  See,  for  example,  page  154  of  [33]. 
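The minimum principle is easy to probe numerically. The sketch below is an illustration under assumptions, not a proof: it uses the second iterate of the logistic map $g(x) = 4x(1-x)$ (compositions of maps with negative Schwarzian derivative again have negative Schwarzian derivative) and compares the smallest value of $|Dg^2|$ on a grid with the smaller of the two endpoint values. The function names are hypothetical.

```python
def g(x, p=4.0):
    """Logistic map, assumed here as an example."""
    return p * x * (1.0 - x)


def dg(x, p=4.0):
    """Derivative of the logistic map with respect to x."""
    return p * (1.0 - 2.0 * x)


def d_g2(x, p=4.0):
    """Derivative of the second iterate g(g(x)) by the chain rule."""
    return dg(g(x, p), p) * dg(x, p)


def min_principle_holds(a, b, p=4.0, samples=2001):
    """Check |D g^2(x)| >= min(|D g^2(a)|, |D g^2(b)|) on a grid over [a, b]."""
    xs = [a + (b - a) * j / (samples - 1) for j in range(samples)]
    grid_min = min(abs(d_g2(x, p)) for x in xs)
    endpoint_min = min(abs(d_g2(a, p)), abs(d_g2(b, p)))
    return grid_min >= endpoint_min - 1e-12  # small tolerance for rounding


if __name__ == "__main__":
    print(min_principle_holds(0.2, 0.45))  # g^2 is monotone on this interval
    print(min_principle_holds(0.4, 0.6))   # contains the critical point x = 1/2
```

The second call shows why monotonicity is required: on an interval containing a critical point the derivative vanishes in the interior, so the inequality cannot hold.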

Definition: Given a map $g : I \to I$, we say that $x$ is in the basin of attraction of an orbit, $\{y_i\}$, if there exists an $m > 0$ such that $\lim_{i \to \infty}(g^{i+m}(x) - y_i) = 0$.

Lemma D.2.4 (Singer) If $g : I \to I$ is $C^3$ and has negative Schwarzian derivative, then the basin of attraction of any stable periodic orbit contains either a critical point or one of the boundary points of $I$.

Proof: See Singer [58].

Definition D.2.1 We will say that a piecewise monotone map, $g : I \to I$, has a sink if there exists an interval $J \subset I$ such that $g^n$ is monotone on $J$ and $g^n(J) \subset J$ for some $n > 0$.

Lemma D.2.5 If $g : I \to I$ satisfies (A0), (A1), (A2), (A3), and (CE1), then $g$ has no sinks.

Proof: It is relatively simple to show that the existence of such a sink implies the existence of a stable periodic point (see for example Collet and Eckmann [14], lemma II.5.1). From Singer's theorem, we know that $g : [0, 1] \to [0, 1]$ does not have a stable periodic orbit unless $x = 0$, $x = c$, or $x = 1$ is in the basin of attraction of that periodic orbit. From (CE1) we know that the critical point does not tend to a stable orbit and from (A2) we know that $x = 0$ and $x = 1$ do not tend to a stable periodic orbit. Thus $g$ has no sinks.


Lemma D.2.6 (Koebe Inequality). Suppose that $g : I \to I$ has negative Schwarzian derivative. Let $T = [a, b]$ be an interval on which $g$ is a diffeomorphism. Given $x \in T$, let $L$ and $R$ be the components of $T \setminus \{x\}$. If there exists $\tau > 0$ such that:

$\frac{|g(L)|}{|g(T)|} \geq \tau$ and $\frac{|g(R)|}{|g(T)|} \geq \tau$

then there exists $K(\tau) > 0$ such that:

$|Dg(x)| \geq K(\tau) \sup_{z \in T} |Dg(z)|$

where $K(\tau)$ depends only on $\tau$.




Proof:  See,  for  example,  theorem  3.2  in  van  Strien  [60]. 

Lemma D.2.7 Let $g : I \to I$ satisfy (A0), (A1), (A2), (A3) and (CE1). Then $g$ satisfies (CE2).

Proof:  See  Nowicki  [44]. 

Lemma D.2.8 Let $g : I \to I$ satisfy (A0), (A1), (A2), (A3) and (CE1). There exists $K > 0$ and $\lambda_1 > 1$ such that for any $n > 0$, if $g^n(x) = c$ then $|x - c| > K\lambda_1^{-n}$.

Proof: From lemma D.2.1, we know there exists $K_0 > 0$ such that $|Dg(x)| < K_0|x - c|$ for any $x \in I$. Now set $a = \sup_{x \in I} |Dg(x)|$. Then we have:

$|Dg^n(x)| < a^{n-1}K_0|x - c|$

However, by lemma D.2.7, we also know that $g$ satisfies (CE2), so that $|Dg^n(x)| > K_E\lambda^n$ for some constants $K_E > 0$ and $\lambda > 1$. Thus $a^{n-1}K_0|x - c| > K_E\lambda^n$, which implies that $|x - c| > \frac{K_E a}{K_0}\left(\frac{\lambda}{a}\right)^n$. This proves the lemma if we set $K = \frac{K_E a}{K_0}$ and $\lambda_1 = \frac{a}{\lambda}$.

Lemma D.2.9 Let $g : I \to I$ satisfy (A0), (A1), (A2), (A3) and (CE1). Let $J_n \subset I$ be any interval such that $g^n$ is monotone on $J_n$. Then there exist constants $K > 0$ and $\lambda_2 > 1$ such that for any $n > 0$:

$|J_n| < K\lambda_2^{-n}$


Proof:  See  Nowicki  [44]. 

Lemma D.2.10 Let $g : I \to I$ satisfy (A0), (A1), (A2), (A3) and (CE1). Suppose that $g^n$ is monotone on $J = [a, b]$ where $J \subset I$ and $g^n(a) = c$ for some $n > 0$. Then there exists a constant, $K > 0$, such that for any $n > 0$:

$\frac{|g^n(J)|}{|J|} > K$


Proof:  See  lemma  6.2  in  Nowicki  [45]. 

Lemma D.2.11 Suppose that $g : I \to I$ satisfies (A0), (A1), (A2), (A3), and (CE1). Let $x \in I$ such that $|g^i(x) - c| > \epsilon$ for $0 < i < n$. Then, for any $\epsilon > 0$ there exist constants $C > 0$ and $\lambda > 1$ (independent of $x$) such that:

$|Dg^i(x)| > C\epsilon^2\lambda^i$

for $0 < i < n$.

Proof: For any $i > 0$, let $A_i(x)$ be the maximal interval such that $x \in A_i(x)$ and $g^i$ is monotone on $A_i(x)$. The proof of the lemma is based on the following claim:

Claim: Let $x \in I$, and suppose that there exists $b \in A_n(x)$ such that $g^n(b) = c$ for some $n > 0$. If $|g^i(x) - c| > \epsilon$ for $0 < i < n$, then there exist $C_0 > 0$ and $\lambda > 1$ (independent of $x$) such that:

$|Dg^{n+1}(x)| > C_0\epsilon^2\lambda^{n+1}$.


We shall now describe the proof of the lemma using this claim, leaving the proof of the claim for later.

Fix $x \in I$ and $i < n$. Suppose that $A_i(x) = [a, a']$ and let $x_i = g^i(x)$, $a_i = g^i(a)$, and $a'_i = g^i(a')$. For definiteness, assume that $|x_i - a_i| < |a'_i - x_i|$ (the other case is analogous). Since $A_i(x)$ is maximal, each endpoint of $A_i(x)$ must map either into (1) the critical point, or (2) into the boundary of $I$. If case (2) is true, there must exist $k < i$ such that $g^k(a) = 0$ or $g^k(a) = 1$ (since $I = [0, 1]$ by (A2)). This means either $a = 0$, $a = 1$ or $g^j(a) = c$ for some $j < k$. If $g^j(a) = c$ then case (1) is also satisfied. Otherwise, if $a = 0$ or $a = 1$, then $g^i(A_i(x)) \cap \{c\} \neq \emptyset$, and the lemma may be proved by a direct application of the claim described above.

Otherwise,  if  case  (1)  is  true,  there  must  exist  k  <i  such  that  g'^{a)  =  c.  By  (CEl), 
we  know  there  exist  constants,  Ke  >  0  and  Ag  >  1  (independent  of  i  and  k)  such  that: 

\Dr’‘-\g‘‘*\a))\  >  KEye'-"  (D.l) 

Now  set  y  G  [a,  o']  so  that  yi  =  g\y)  =  |(ai  +  a'f).  By  the  Koebe  Inequality,  since 
\yk  —  <  \a']^  —  yk\i  there  exists  Kq  =  K{t  =  |)  >  0  such  that: 

Combining  this  with  (D.l)  we  have: 

>  KoKeX^^-^  (d.2) 

Also,  since  |x,-  —  a,!  <  |a(  —  x,|,  we  know  x,-  G  [o.i\yi]  (where  [a;  6]  means  either  [a, 6]  or 
[6,  a]  whichever  is  appropriate).  Thus  by  using  the  minimum  principle  with  (D.l)  and 
(D.2)  we  find  that  there  exists  Ki>  Q  such  that: 

(D.3) 

We are now ready to apply the claim. It is clear that $a \in A_k(x)$. Since $g^k(a) = c$, the claim implies that there exists $C_0 > 0$ and $\lambda_0 > 1$ such that:

$|Dg^{k+1}(x)| > C_0\epsilon^2\lambda_0^{k+1}$    (D.4)

Combining (D.3) and (D.4) we find that there exists $C > 0$ and $\lambda > 1$ such that:

$|Dg^i(x)| = |Dg^{i-k-1}(g^{k+1}(x))|\,|Dg^{k+1}(x)| > C\epsilon^2\lambda^i$.

This proves the lemma, except for the proof of the claim, which we describe below.

Proof of Claim: Let $A_n(x) = [a, a']$. If $b = a$ or $b = a'$ then the proof is trivial since $g$ satisfies (CE2) from lemma D.2.7. So suppose that $b \in (a, a')$. For definiteness suppose that $x < b$ so that $x \in [a, b]$ (the other case is analogous). As before, since $A_n(x)$ is maximal, the endpoints of $A_n(x)$ must map either into the critical point, or into the boundary of $I$. Let us address the critical point case now, and come back to the other case at the end of the proof.

Assume that there exists $k < n$ such that $g^k(a) = c$. Let $a_k = g^k(a)$ and $b_k = g^k(b)$ and let $y \in [a, b]$ such that $y_k = g^k(y) = \frac{1}{2}(a_k + b_k)$. By the Koebe Inequality we know that there exists $K_2 = K(\tau = \frac{1}{2})$ such that $|Dg^k(y)| > K_2|Dg^k(a)|$. Also, since $g$ satisfies (CE2), there exists $K_E > 0$ and $\lambda_E > 1$ such that:

$|Dg^k(a)| > K_E\lambda_E^k$.    (D.5)

Combining the last two statements, we find that

$|Dg^k(y)| > K_2K_E\lambda_E^k$    (D.6)

Now let $y' \in [a, b]$ so that $y'_k = g^k(y') = a_k + \frac{1}{2}\mathrm{sgn}(b_k - a_k)\epsilon \in [a_k; b_k]$. Since $x_k = g^k(x) \in [a_k; b_k]$, we know $|x_k - a_k| = |x_k - c| > \epsilon$. Consequently $|b_k - a_k| > \epsilon$, which implies $|y_k - a_k| > \frac{1}{2}\epsilon$. Thus, since $|y'_k - a_k| = \frac{1}{2}\epsilon$, we have $y'_k \in [a_k; y_k]$.


[Figure D.1: a number line marking $a_k = c$, $y'_k = a_k + \frac{1}{2}\epsilon$, $a_k + \epsilon$, $y_k = \frac{1}{2}(a_k + b_k)$, and $b_k$, with the note that $x_k$ must be in the marked interval.]

Figure D.1: The interval $g^k(A_n(x)) = [a_k, a'_k]$ and associated variables are shown. The figure is drawn assuming that $a'_k > a_k$, $b \in (a, a')$, and that $x \in [a, b]$.

Applying the minimum principle to this interval and using (D.5) and (D.6), we find that there exists $K_3 > 0$ such that:

$|Dg^k(y')| > K_3\lambda_E^k$    (D.7)

Also, for any $\epsilon > 0$, we know from lemma D.2.1 that there exists $K_4 > 0$ such that

$|Dg(y'_k)| > \frac{1}{2}K_4\epsilon$    (D.8)

From (D.7) and (D.8) and setting $K_5 = \frac{1}{2}K_3K_4$, we have:

$|Dg^{k+1}(y')| > K_5\epsilon\lambda_E^k$    (D.9)

Also,  since  g'‘{a)  —  c,  from  (CEl)  we  know  that  \Dg^~'^~^{g^'^^{a))\  > 

Since  g"'{b)  =  c,  we  know  from  (CE2)  that  \Dg^~^~^ {g'^'^^ [b))\  >  KeX'^^~^-  Thus,  by 
the  minimum  principle,  \Dg'^~^~^{g^'^^{y'))\  >  KeX^^~^  .  Combining  this  with  (D.9)  we 
find: 

\Dg^{y')\  >  KsKecXI.  (D.IO) 

From  (CE2)  we  also  know  that 

\Dg^{b)\>KEXl.  (D.ll) 

In  addition,  since  \xk  —  flfcl  >  e,  we  know  that  Xk  €  [y'k'ibk]  so  that  x  G  [y',b].  Thus, 
from  the  (D.IO),  (D.ll),  and  the  minimum  principle,  we  can  conclude  that  there  exists 
Ke>  0  such  that: 

\Dg-ix)\  >  KeeX%. 

Finally,  since  |5'”(x)  —  c|  >  e,  we  can  use  lemma  D.2.1  to  bound  \Dg{g'^{x))\  <  ^'46  for 
K4  >  0.  Consequently  there  exists  Ci  >  0  such  that: 

|£ls”+'(i)|  >  C7,eUj  (D.12) 

which  proves  the  claim  for  the  case  where  g’^ia)  =  c  for  some  k  <  n. 

The  other  possibility  is  that  g^{a)  G  Bd{I)  for  some  k  <  n  where  Bd{I)  denotes 
the  boundary  of  I.  But  this  implies  that  either  a  G  Bd{I)  or  possibly  that  g^~^{a)  =  c. 
The  possibility  where  g^~^{a)  =  c  has  already  been  covered  by  the  previous  case.  On 
the  other  hand,  if  a  G  Bd{I)  then  by  (A2)  there  exists  Ao  >  1  such  that  |D^’^(a)l  >  Aq. 
From  (CE2)  we  also  know  that  \Dg‘^{b)\  >  KeX^.  Thus,  by  the  minimum  principle, 
there  exists  A'7  >  0  and  Ai  >  0  such  that  \Dg'^{x)\  >  A7A"  for  any  x  G  [a, 6].  Then, 
since  \g'^{x)  —  c|  >  e  we  can  use  lemma  D.2.1  to  bound  \Dg{g'^{x))\  so  that  there  exists 
(^2  >  0  satisfying: 

|Dp"+'(x)|>C'2eA^  (D.13) 

Combining  (D.12)  and  (D.13)  shows  that  we  can  pick  (7  >  0  and  A  >  1  to  prove  the 
claim.

Lemma D.2.12 Let $g : I \to I$ satisfy (A0), (A1), (A2), (A3), and (CE1). Suppose there exists $a \in I$ and $n > 0$ such that $g^n(a) = c$. Given any $\alpha > 0$ small, either $\min_{0 \leq i < n} |g^i(a) - c| > \alpha$ or there exists $b \in I$, $n' > 0$, and constants $K > 0$ and $K' > 0$ such that $g^{n'}(b) = c$, $|b - a| < K\alpha$, and $n' < n - K'\log\frac{1}{\alpha}$.

Proof: Suppose that $\min_{0 \leq i < n} |g^i(a) - c| < \alpha$. Then there exists $m < n$ such that $|g^m(a) - c| < \alpha$ and $|g^i(a) - c| > \alpha$ for $0 < i < m$.

Since $g^m(a)$ approaches close to $c$, we can bound $m$ away from $n$ using lemma D.2.8:

n-m>  ^  (D.U) 

log  Al 

where  Ai  >  1  is  a  constant  dependent  only  on  g. 

We  now  consider  two  possibilities:  (1)  there  exists  b  G  I  such  that  g™'{b)  =  c  and  g 
is  monotone  on  [<i;  b]  or  (2)  there  exists  6  £  /  and  k  <.  m  such  that  g'^  is  monotone  on 
[a;  6],  /(6)  =  c,  and  €  [g'^{a)\c].  One  of  these  two  cases  must  be  true. 

Let  a,  =  g\a)  and  bi  =  y'(6)  for  z  >  0.  In  the  first  case,  from  lemma  D.2.10,  there 
exists  A'a  >  0  such  that: 

|6-<,|<^|6„-c|<^.  (D.15) 

Also,  from  (D.14)  we  know  m  <  n  -  Thus,  in  this  case  the  lemma  is  proved  if  we 

set  K  =  ■^,  K'  =  ^  and  n'  =  m. 

Now  we  address  the  second  case.  From  lemma  D.2.1  we  know  there  exists  Kq  >  0 
and  >  0  such  that  Ko\x  -  cp  <  \f{x)  -  /(c)|  <  Ki\x  -  cp.  Thus  if  we  set  K2  =  ^ 
we  see  that  for  any  6^  >  0  and  6*  >  we  have  that: 

y([c±^;c])  C  y([c;c±r])  (D.16) 

where  the  ±  notation  means  that  the  relation  holds  for  all  four  possible  combinations. 
Also  note  that  since  bk  =  c  and  bm  €  [omi  c]  we  have: 

[ofc4.i;6fc+i]  =  gi[0'k]bk])  =  g{[(ik]c]) 

[Om+i;  =  y([Om;  ^m])  c  y([am;  c]).  (D.18) 

We  now  assert  that  \ak—bk\  <  K2OL.  Suppose  to  the  contrary  that  |afc— c|  =  \ak—bk\  > 
K20i  >  K2\am  —  c|.  Then,  combining  this  with  (D.16),  (D.17),  and  (D.18)  implies  that: 

^  (D.19) 
However, since $g$ satisfies (CE1), it cannot have any sinks (from lemma D.2.5). In
particular this means:

if  A:  <  m  since  is  monotone  on  [a;  &]  if  a  >  0  is  sufficiently  small.  Thus,  (D.19) 
cannot  be  true  so  we  conclude  that: 

\ak  -  h\  <  K20C. 

Finally,  since  hk  =  c,  we  can  use  D.2.10  to  show  that  there  exists  Kz>  Q  such  that: 

\b-a\<-^\at-bi:\  =  ^K,a  (D.20) 

rta  A3 

Thus  combining  (D.14)  and  (D.20)  we  see  that  the  lemma  is  satisfied  if  we  set  K  = 

K'  =  and  n'  =  k<m<n  — 

Thus,  combining  the  results  from  (D.15)  and  (D.20),  proves  the  lemma. 

Lemma D.2.13 Suppose $g : I \to I$ satisfies (A0), (A1), (A2), (A3), and (CE1). Then there exists $C > 0$ and $\epsilon_0 > 0$ so that given any positive $\epsilon < \epsilon_0$, and any $x \in I$ such that $x + \epsilon \in I$, there is a $y \in (x, x + \epsilon)$ such that $N(y,g) < \infty$ and $\min_{0 \leq i < N(y,g)} |g^i(y) - c| > C\epsilon$. Similarly if $x - \epsilon \in I$, then there exists $y' \in (x - \epsilon, x)$ such that $N(y',g) < \infty$ and $\min_{0 \leq i < N(y',g)} |g^i(y') - c| > C\epsilon$.

Proof: We show the proof for $y \in (x, x + \epsilon)$. The proof for $y' \in (x - \epsilon, x)$ is exactly analogous.

Our plan is to apply lemma D.2.12 as many times as necessary to find an appropriate $y$ to satisfy the lemma. In other words, lemma D.2.12 implies that given any $y_i \in I$ such that $n_i = N(y_i,g) < \infty$ and $\min_{0 \leq j < n_i} |g^j(y_i) - c| < \alpha$, there exists a $y_{i+1} \in I$ such that $|y_{i+1} - y_i| < K\alpha$ and

$n_{i+1} = N(y_{i+1},g) < n_i - K'\log\frac{1}{\alpha}$    (D.21)

for positive constants $K$ and $K'$. Thus given $y_0$, we can generate a sequence $\{y_i\}_{i=0}^{m}$ in this manner for increasing $i$ until $i = m$ such that

$\min_{0 \leq j < n_m} |g^j(y_m) - c| > \alpha$.    (D.22)

For  example,  given  any  a  >  0,  and  any  xq  G  I  we  know  from  lemma  D.2.9  that  if 
xq  +  a  G  I,  then  there  exists  yo  G  (xo,  xq  +  a)  such  that  g‘^°{yo)  =  c  for  some  integer 
satisfying: 

no  <  +  1  (D.23) 

logA2 
where $\lambda_2 > 0$ is a constant dependent only on $g$. If we generate $\{y_i\}_{i=0}^{m}$ from the $y_0$ specified above, then from (D.21) and (D.23) we find that:

m  <  (r^  -  i/i:0(log  -)  +  1  (D-24) 

log  A2  ct 

for  all  0  <  i  <  m.  Set  M  =  +  1-  Then  for  sufficiently  small  o:  >  0  we  find  that 

m  <  M  because  otherwise  (D.24)  would  imply  that  Uj  <  0  for  i  >  m. 

So  given  a:  €  /  and  positive  c  <  cq  from  the  statement  of  the  lemma,  set  xq 
X  +  KMa  and  o;  =  2KM+1  choose  eo  >  0  to  insure  that  or  >  0  is 

sufficiently  small  so  that  the  above  arguments  work.  Also,  note  that  since  sq  +  a  = 
X  +  +\  ^  <a;  +  e,  ifa:  +  eG  /  then  xo  +  a  e  I.  From  our  choice  of  yo  E  {xo,  xq  +  a), 

we  also  know  that  since  \yi+i  —  yi\  <  Ka,  we  have  \ym  —  Vo]  <  Kma.  Consequently 
ym  >  x  +  KMa  —  Kma  >  x  and  y^  >  x  +  KMa  +  a  +  Kma  >  x  +  {2KM  +  l)a  <  x  +  e. 
Thus  ym  €  {x,x  +  e)  and  from  (D.22),  we  have  that  mino<i<n,„  Ig'iVm)  -  c\  >  a  =  Ce 
where  C  =  2KM+\-  Setting  y  =  ym,  this  proves  the  lemma. 


D.3  Analyzing  preimages 

In this section we will investigate one-parameter families of mappings, $\{f_p \mid p \in I_p\}$, that
satisfy (B0) and (B1). Our discussion depends on an examination of the preimages of
the critical point, $x = c$, in $I_x \times I_p$ space. We first need to introduce some notation in
order to describe the relevant concepts.

For the remainder of this section, $\{f_p \mid p \in I_p\}$ will refer to a given one-parameter
family of mappings satisfying (B0) and (B1). We will consider the set of preimages,
$P(n) \subset I_x \times I_p$, satisfying:

$$P(n) = \{(x,p) \mid f^i(x,p) = c \text{ for some } 0 \le i \le n\}.$$
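As a concrete illustration of the preimage set $P(n)$, the following Python sketch enumerates the slice of $P(n)$ at one fixed parameter value by repeatedly inverting the map. It assumes the quadratic family $f(x,p) = p\,x(1-x)$ on $[0,1]$ with critical point $c = 1/2$. That family is only a hypothetical example chosen because its inverse branches have a closed form; the text does not single it out, and the function names are illustrative.

```python
import math


def logistic(x, p):
    """Example map f(x, p) = p x (1 - x); an assumption, not the text's family."""
    return p * x * (1.0 - x)


def preimages(y, p):
    """Real solutions x of p x (1 - x) = y (zero, one, or two of them)."""
    disc = 1.0 - 4.0 * y / p
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    if root == 0.0:
        return [0.5]
    return [0.5 * (1.0 - root), 0.5 * (1.0 + root)]


def preimage_slice(n, p, c=0.5):
    """All x with f^i(x, p) = c for some 0 <= i <= n, at fixed p."""
    level = [c]          # points that reach c in exactly i steps
    points = [c]
    for _ in range(n):
        level = [x for y in level for x in preimages(y, p)]
        points.extend(level)
    return sorted(points)


if __name__ == "__main__":
    for x in preimage_slice(2, 4.0):
        print(f"{x:.6f}")
```

Sweeping `p` over a grid and plotting the resulting slices gives a picture of the branches of $P(n)$ in combined state and parameter space.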


First of all, it will be useful to have a way of specifying particular “sections” of
preimages, $R(n, x_0, p_0)$, extending from a particular point $(x_0, p_0) \in I_x \times I_p$. So let
$R(n, x_0, p_0) \subset I_x \times I_p$ denote the set of path-connected elements, consisting of all points
$(x', p') \in I_x \times I_p$ such that there exists a continuous function $g : I_p \to I_x$ satisfying
$g(p_0) = x_0$, $g(p') = x'$, and

$$\{(x,p) \mid x = g(p),\ p \in [p_0; p']\} \subset P(n),$$

where $[p_0; p']$ may denote either $[p_0, p']$ or $[p', p_0]$, whichever is appropriate.

A roadmap of the development in this section is as follows. In lemma D.3.1 we show
that $P(n)$ cannot have isolated points or curve segments. Instead, each point in $P(n)$
must be part of a path-connected set of points in $P(n)$ that stretches for the length of the
parameter space, $I_p$. In lemma D.3.2 we demonstrate that if the kneading invariant of $f_p$,
$D(f_p, t)$, is monotonically decreasing (or increasing), then $P(n)$ must have a branching
tree-like structure. As we travel along one direction in parameter space, branches of $P(n)$
must either always merge or always split away from each other. For example if $D(f_p, t)$
is monotonically decreasing, then branches of $P(n)$ can only split away from each other
as we increase the parameter $p$. Thus in this case, $R(n, y_-, p_0)$ and $R(n, y_+, p_0)$ cannot
intersect each other for $p \ge p_0$ if $y_+ \ne y_-$, and $y_+, y_- \in I_x$.

In lemmas D.3.3, D.3.4, D.3.5, and D.3.6 we develop bounds on the derivatives for
differentiable branches of $R(n, x, p_0)$. The basic idea behind lemma D.3.7 is that we
can use these bounds to demonstrate that for maps, $f_p$, with kneading invariants that
decrease monotonically in parameter space, there exist constants $C > 0$ and $\delta p > 0$ such
that if $x_0 \in I_x$ and

$$U(p) = \{x \mid |x - x_0| < C(p - p_0)^{1/3}\} \tag{D.25}$$

for any $p \in I_p$, then for any $p' \in [p_0, p_0 + \delta p]$, there exists $x'_+ \in U(p')$ such that
$(x'_+, p') \in R(n_+, y_+, p_0)$ for some $y_+ > x_0$ and $n_+ > 0$ assuming that $f^{n_+}(y_+, p_0) = c$.
Likewise there exists $x'_- \in U(p')$ such that $(x'_-, p') \in R(n_-, y_-, p_0)$ for some $y_- < x_0$ and
$n_- > 0$ where $f^{n_-}(y_-, p_0) = c$.

However, setting $n = \max\{n_+, n_-\}$, since $R(n, y_-, p_0)$ and $R(n, y_+, p_0)$ do not intersect
each other for $p \ge p_0$ and $y_- \ne y_+$, we also know that for any $y_- < y_+$, there is
a region in $I_x \times I_p$ space bounded by $R(n, y_-, p_0)$, $R(n, y_+, p_0)$, and $p \ge p_0$. Given any
$x_0 \in I_x$, take the limit of this region as $y_- \to x_0^-$, $y_+ \to x_0^+$, and $n \to \infty$. Call the
resulting region $S(x_0)$. Observe that $S(x_0)$ is a connected set that is invariant under $f$
and is nonempty for every parameter value $p \in I_p$ such that $p \ge p_0$. Thus since $S(x_0)$ is
bounded from (D.25), there exists a set of points, $S(x_0)$, in combined state and parameter
space that “shadow” any trajectory, $\{f^i_{p_0}(x_0)\}_{i=0}^{\infty}$, of $f_{p_0}$. Finally we observe that a
subset of $S(x_0)$ can be represented by the form given for $W(x_0)$.

We  are  now  ready  to  examine  these  arguments  more  formally. 

Lemma D.3.1 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Suppose that $x_0 \in I_x$ satisfies $n = N(x_0, f_{p_0}) < \infty$ for some
$p_0 \in \mathrm{int}(I_p)$. Then the following statements hold true:

(1) There exists a closed interval $J_p(x_0,p_0) \subset I_p$, and a function $h_{(x_0,p_0)} : J_p(x_0,p_0) \to I_x$
such that $p_0 \in \mathrm{int}(J_p(x_0,p_0))$, $h_{(x_0,p_0)}(p_0) = x_0$, and $f^n(h_{(x_0,p_0)}(p), p) = c$ for all
$p \in J_p(x_0,p_0)$. Also, if $J_p(x_0,p_0) = [a, b]$ then $a$ is either an endpoint of $I_p$ or
$f^i(h_{(x_0,p_0)}(a), a) = c$ for some $i < n$, and similarly for $b$.

(2) There exists a continuous function, $g_{(x_0,p_0)} : I_p \to I_x$, such that $g_{(x_0,p_0)}(p_0) = x_0$ and

$$\{(x,p) \mid x = g_{(x_0,p_0)}(p),\ p \in I_p\} \subset P(n).$$

Proof: Suppose that $f^{m_0}(x_0,p_0) = c$ for $m_0 \le n$ and $f^i(x_0,p_0) \ne c$ for $0 \le i < m_0$. Then
define the set $S(x_0,p_0) \subset I_x \times I_p$ to be the maximal path-connected set satisfying the
following conditions:

(1) $(x_0,p_0) \in S(x_0,p_0)$

(2) $(x,p) \in S(x_0,p_0)$ if $p \in I_p$ and $f^i(x,p) \ne c$ for every $0 \le i < m_0$.

Note that $S(x_0,p_0)$ must contain an open neighborhood around $(x_0,p_0)$ because of the
continuity of $f$.

Now let $\bar{S}(x_0,p_0)$ be the closure of $S(x_0,p_0)$, define $Q_{(x_0,p_0)}(p) = \{x \mid (x,p) \in \bar{S}(x_0,p_0)\}$, and let

$$J_p(x_0,p_0) = \Big[\inf_{(x,p) \in \bar{S}(x_0,p_0)} p,\ \sup_{(x,p) \in \bar{S}(x_0,p_0)} p\Big] \tag{D.26}$$


We claim that $Q_{(x_0,p_0)}(p)$ must consist of a single connected interval for every
$p \in J_p(x_0,p_0)$. Otherwise if there existed $x_1 < x_2 < x_3$ such that $x_1 \in Q_{(x_0,p_0)}(p)$,
$x_2 \notin Q_{(x_0,p_0)}(p)$, and $x_3 \in Q_{(x_0,p_0)}(p)$, then there would exist $i < m_0$ such that
$c \in [f^i(x_1,p), f^i(x_3,p)]$. But since $(x_1,p) \in \bar{S}(x_0,p_0)$ and $(x_3,p) \in \bar{S}(x_0,p_0)$ there exists a
connected path, $\{(x(t),p(t)) \mid t \in [0,1]\} \subset \bar{S}(x_0,p_0)$, joining $(x_1,p)$ and $(x_3,p)$,
where $x(t) : [0,1] \to I_x$ and $p(t) : [0,1] \to I_p$ are continuous functions. Along this path,
$f^i(x(t),p(t))$ is continuous and $f^i(x(t),p(t)) \ne c$ for any $t \in [0,1]$. This contradicts the
assertion that $c \in [f^i(x_1,p), f^i(x_3,p)]$ and proves the claim that $Q_{(x_0,p_0)}(p)$ must consist
of a single interval for all $p \in J_p(x_0,p_0)$.

Returning to the proof of the lemma we find that, since $(x,p) \in S(x_0,p_0)$ implies
$f^i(x,p) \ne c$ for every $0 \le i < m_0$, we know that $f_p^{m_0}(x)$ must be strictly monotonic
on $Q_{(x_0,p_0)}(p)$ for each $p \in J_p(x_0,p_0)$. Thus for each $p \in J_p(x_0,p_0)$ there is exactly one
$x \in Q_{(x_0,p_0)}(p)$ such that $f^{m_0}(x,p) = c$. Consequently there exists a function
$h_{(x_0,p_0)} : J_p(x_0,p_0) \to I_x$ such that $f^{m_0}(h_{(x_0,p_0)}(p), p) = c$ and $h_{(x_0,p_0)}(p) \in Q_{(x_0,p_0)}(p)$ for all $p \in J_p(x_0,p_0)$.
Furthermore, the function $h_{(x_0,p_0)}$ must be differentiable for $p \in \mathrm{int}(J_p(x_0,p_0))$ since $f(x,p)$ is
differentiable and $f_p^{m_0}(x)$ is strictly monotonic in $x$ for $x \in Q_{(x_0,p_0)}(p)$. Finally, from our choice of
$S(x_0,p_0)$ and $h_{(x_0,p_0)}(p)$, it is clear that $(h_{(x_0,p_0)}(p), p) \in P(n)$ for all $p \in J_p(x_0,p_0)$. This
proves property (1) of the lemma.
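The branch function $h_{(x_0,p_0)}(p)$ of property (1) can be followed numerically. The sketch below is one possible way to do this, not the construction used in the proof: it steps the parameter in small increments and re-solves $f^n(x,p) = c$ by Newton's method, starting from the previous solution. It assumes the example family $f(x,p) = p\,x(1-x)$ with $c = 1/2$; the step count and iteration count are arbitrary illustrative choices.

```python
import math


def f(x, p):
    """Example map f(x, p) = p x (1 - x); an assumption for illustration."""
    return p * x * (1.0 - x)


def iterate(x, p, n):
    """Return f^n(x, p)."""
    for _ in range(n):
        x = f(x, p)
    return x


def d_iterate_dx(x, p, n):
    """Return D_x f^n(x, p) via the chain rule along the orbit."""
    d = 1.0
    for _ in range(n):
        d *= p * (1.0 - 2.0 * x)
        x = f(x, p)
    return d


def track_branch(x0, p0, p1, n, c=0.5, steps=200, newton_iters=20):
    """Follow the solution of f^n(x, p) = c from (x0, p0) towards p1.

    Returns a list of (p, x) pairs. Newton's method fails where
    D_x f^n vanishes, which is where an earlier iterate hits c.
    """
    branch = [(p0, x0)]
    x = x0
    for k in range(1, steps + 1):
        p = p0 + (p1 - p0) * k / steps
        for _ in range(newton_iters):
            slope = d_iterate_dx(x, p, n)
            if slope == 0.0:
                raise ZeroDivisionError("D_x f^n vanished; branch endpoint reached")
            x -= (iterate(x, p, n) - c) / slope
        branch.append((p, x))
    return branch


if __name__ == "__main__":
    start = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 / 3.8))  # f(start, 3.8) = 1/2
    for p, x in track_branch(start, 3.8, 3.9, 1, steps=5):
        print(f"p = {p:.3f}, h(p) = {x:.6f}")
```

The failure case in `track_branch` corresponds to the endpoints of $J_p(x_0,p_0)$, where $f^i$ of the branch reaches $c$ for some $i < n$ and a new branch function has to be started.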

We now have to construct a continuous $g_{(x_0,p_0)}(p)$ that is valid over the entire range
of $I_p$. Suppose that $J_p(x_0,p_0) = [p_{-1}, p_1]$. Let $x_1 = h_{(x_0,p_0)}(p_1)$. From our specification
of $S(x_0,p_0)$ it is clear that $f^j(x_1,p_1) = c$ for some $j < m_0$. Thus there exists $m_1 <
m_0$ such that $f^{m_1}(x_1,p_1) = c$ and $f^i(x_1,p_1) \ne c$ for $0 \le i < m_1$. Consequently, we
can use the same arguments as before to consider the set $S(x_1,p_1)$, and generate a
continuous function, $h_{(x_1,p_1)}(p)$, such that $(h_{(x_1,p_1)}(p), p) \in P(n)$ for all $p \in J_p(x_1,p_1)$
where $J_p(x_1,p_1) \supset [p_1, p_2]$ for some $p_2 > p_1$. This argument can be carried out repeatedly
for $m_0 > m_1 > m_2, \ldots$ and so forth. However, since $f^{m_i}(x_i,p_i) = c$ and the $m_i$ form a
strictly decreasing sequence of nonnegative integers, we see that
$\sup(I_p) \in J_p(x_i,p_i)$ for some $i \le n$. Similarly we can also use the same arguments for
$p < p_0$, working in the opposite direction in parameter space in order to successively
generate $(h_{(x_{-i},p_{-i})}(p), p) \in P(n)$ for increasing values of $i$. Consequently, there exist
$-n \le a \le 0$ and $0 \le b \le n$ such that $I_p = \bigcup_{i=a}^{b} J_p(x_i,p_i)$. Now if we set $g_{(x_0,p_0)} : I_p \to I_x$ to be

$$g_{(x_0,p_0)}(p) = h_{(x_i,p_i)}(p) \quad \text{if } p \in J_p(x_i,p_i), \tag{D.27}$$

we can see that $g_{(x_0,p_0)}(p)$ is continuous since $h_{(x_i,p_i)}(p)$ is differentiable for $p \in \mathrm{int}(J_p(x_i,p_i))$, and
$h_{(x_i,p_i)}(p_i) = h_{(x_{i-1},p_{i-1})}(p_i)$ for all $a < i \le b$. Finally, since $(h_{(x_i,p_i)}(p), p) \in P(n)$ for all
$a \le i \le b$ we see that $g_{(x_0,p_0)}(p)$ has the properties guaranteed by the lemma.

Lemma D.3.2 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Suppose that there exists $\delta p > 0$ such that the kneading invariant
$D(f_p, t)$ is monotonically decreasing for $p \in [p_0, p_0 + \delta p]$. Then

$$R(n, y_0, p_0) \cap R(n, y_1, p_0) \cap (I_x \times [p_0, p_0 + \delta p]) = \emptyset \tag{D.28}$$

for any $y_0 \ne y_1$ and any $n > 0$ such that $y_0 \in I_x$ and $y_1 \in I_x$.

Proof: Suppose that there exists $y_0 \in I_x$ and $y_1 \in I_x$ such that

$$R(n, y_0, p_0) \cap R(n, y_1, p_0) \cap (I_x \times [p_0, p_0 + \delta p]) \ne \emptyset \tag{D.29}$$

for some $n > 0$ where $N(y_0, f_{p_0}) \le n$ and $N(y_1, f_{p_0}) \le n$. It is sufficient to show that
this statement contradicts the condition that $D(f_p, t)$ is monotonically decreasing for
$p \in [p_0, p_0 + \delta p]$.

Let $p' > p_0$ be the smallest value such that there exists a pair of points $y_2 \in I_x$ and
$y_3 \in I_x$ with $y_2 < y_3$ satisfying:

$$R(n, y_2, p_0) \cap R(n, y_3, p_0) \cap (I_x \times [p_0, p']) \ne \emptyset. \tag{D.30}$$

Assuming that (D.29) is true, we know that $p' \le p_0 + \delta p$. Now fix $y_2$ in
(D.30) and let $y_3$ take on all values such that $y_3 > y_2$ and $y_3 \in I_x$. Let
$y_4$ be the smallest possible value of $y_3$ that satisfies (D.30) and set $x' \in I_x$ such that
$(x', p') \in R(n, y_2, p_0)$ and $(x', p') \in R(n, y_4, p_0)$.

Let $G_2$ be the set of all continuous functions, $g : I_p \to I_x$, such that $g(p') = x'$ and
$(g(p), p) \in R(n, y_2, p_0)$ for all $p \in I_p$. By lemma D.3.1, there exists at least one element
in $G_2$. Set

$$g_2(p) = \sup_{g \in G_2} g(p). \tag{D.31}$$

Clearly $g_2$ must also be a continuous function that satisfies $g_2(p') = x'$ and $(g_2(p), p) \in
R(n, y_2, p_0)$ for all $p \ge p_0$ if $p \in I_p$. Similarly we can define $g_4$ in an analogous way, making

$$g_4(p) = \inf_{g \in G_4} g(p) \tag{D.32}$$

where $G_4$ is the set of all functions $g : I_p \to I_x$ satisfying $g(p') = x'$ and $(g(p), p) \in
R(n, y_4, p_0)$ for all $p$ satisfying $p \in I_p$ and $p \ge p_0$.

Because of our choice of $p'$, we know that $g_2(p) \ne g_4(p)$ if $p \in [p_0, p')$. Now let

$$J_2 = \{(g_2(p), p) \mid p \in I_p\}$$

$$J_4 = \{(g_4(p), p) \mid p \in I_p\}.$$

And let $M \subset I_x \times I_p$ be the interior of the region bounded by $J_2 \cup J_4 \cup (I_x \times \{p_0\})$. From
our choice of $p'$ we know that

$$J_2 \cap R(n, y, p_0) \cap (I_x \times [p_0, p')) = \emptyset$$

$$J_4 \cap R(n, y, p_0) \cap (I_x \times [p_0, p')) = \emptyset$$

for any $y \ne y_2$ and $y \ne y_4$. From our choice of $y_4$ we also know that $(x', p') \notin R(n, y, p_0)$
for any $y \in (y_2, y_4)$. Thus we conclude that no $R(n, y, p_0)$ intersects $M$ for any $y \in I_x$
satisfying $y \ne y_2$, $y \ne y_4$, and $N(y, f_{p_0}) \le n$. Finally, from our choice of $g_2$ and $g_4$
it is also apparent that neither $R(n, y_2, p_0)$ nor $R(n, y_4, p_0)$ intersects $M$. Consequently,
we see that:

$$M \cap P(n) = \emptyset. \tag{D.33}$$


Now let

$$M_x(p) = \{x \mid (x,p) \in \bar{M}\}$$

where $\bar{M}$ denotes the closure of $M$. From (D.33) we know that $f^i$ is strictly monotonic
on $M_x(p)$ for any $0 \le i \le n$. Note in particular that this implies that there can exist no
$0 \le i \le n$ such that

$$f^i(g_2(p), p) = f^i(g_4(p), p) = c \tag{D.34}$$

for any $p \in [p_0, p')$.

Now let $\{a_k\}_{k=0}^{\infty}$ be a monotonically increasing sequence such that $a_0 = p_0$ and
$a_k \to p'$ as $k \to \infty$. We know that for any $p \in [p_0, p']$, there exists a $k \le n$ such that
$f^k(g_2(p), p) = c$. Thus consider the sequence $\{b_k\}_{k=0}^{\infty}$ where $b_k = N(g_2(a_k), f_{a_k})$. Since $b_k$
can only take on a finite number of values ($0 \le b_k \le n$), we know there exists an infinite
subsequence $\{k_i\}_{i=0}^{\infty}$ such that $b_{k_i} = b$ if $i \ge 0$ for some $0 \le b \le n$. This implies that
$f^b(g_2(a_{k_i}), a_{k_i}) = c$ for all $i \ge 0$. Also, since $f$ is continuous and $a_{k_i} \to p'$ as $i \to \infty$, we
can also conclude that

$$f^b(g_2(p'), p') = f^b(x', p') = c. \tag{D.35}$$

We also play the same game with $g_4$ instead of $g_2$. Consider the sequence $\{d_i\}_{i=0}^{\infty}$
where $d_i = N(g_4(a_{k_i}), f_{a_{k_i}})$. We know that $d_i$ can only take on a finite number of values,
so there exists an infinite subsequence, $\{i_j\}_{j=0}^{\infty}$, and a number $0 \le d \le n$ such that $d_{i_j} = d$
for all $j \ge 0$. In this case, $f^d(g_4(a_{k_{i_j}}), a_{k_{i_j}}) = c$ for all $j \ge 0$. Since $a_{k_{i_j}} \to p'$ as $j \to \infty$
this implies that

$$f^d(g_4(p'), p') = f^d(x', p') = c. \tag{D.36}$$

However, from (D.34) we also know that $d_i \ne b_{k_i}$ for all $i \ge 0$. Thus $d \ne b$. For
definiteness assume $b < d$. There exists $\delta p_1 > 0$ such that if $p \in [p' - \delta p_1, p']$ then
$f^i(g_2(p), p) \ne c$ whenever $f^i(g_2(p'), p') \ne c$ for any $i$ satisfying $b < i \le d$. Choose $p^* = a_{k_{i_J}}$ for some
$J > 0$ large enough such that $p^* \ge p' - \delta p_1$. Note that by this choice of $p^*$, we know that
$f^b(g_2(p^*), p^*) = c$ and $f^d(g_4(p^*), p^*) = c$.

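Now recall the definition of the kneading invariant:

$$D(f_p, t) = 1 + \sum_{i=1}^{\infty} \theta_i(f_p)\, t^i$$

where

$$\theta_i(f_p) = \epsilon_1(f_p)\,\epsilon_2(f_p) \cdots \epsilon_i(f_p)$$

$$\epsilon_i(f_p) = \lim_{x \to c^+} \operatorname{sgn}\big(D_x f(f^i(x,p), p)\big)$$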
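To make the definition concrete, the following Python sketch computes the first few coefficients $\theta_i(f_p)$ and a truncated kneading series. It assumes the example family $f(x,p) = p\,x(1-x)$ with $c = 1/2$, and it approximates the one-sided limit $x \to c^+$ by evaluating at $c$ plus a small fixed offset. Both are assumptions made for illustration. Because nearby orbits separate quickly in a chaotic map, the finite offset is only trustworthy for a small number of terms.

```python
def f(x, p):
    """Example map f(x, p) = p x (1 - x); an assumption for illustration."""
    return p * x * (1.0 - x)


def dfdx(x, p):
    """Derivative D_x f(x, p) of the example map."""
    return p * (1.0 - 2.0 * x)


def sign(v):
    return 1 if v > 0 else (-1 if v < 0 else 0)


def kneading_coefficients(p, count, c=0.5, offset=1e-6):
    """Return [theta_1, ..., theta_count] with theta_i = eps_1 * ... * eps_i.

    eps_i is taken as sgn(D_x f(f^i(c + offset, p), p)), a finite-offset
    stand-in for the one-sided limit x -> c+.
    """
    x = c + offset
    theta = 1
    thetas = []
    for _ in range(count):
        x = f(x, p)                 # x is now f^i(c + offset, p)
        theta *= sign(dfdx(x, p))   # multiply in eps_i
        thetas.append(theta)
    return thetas


def kneading_partial_sum(p, t, count):
    """Truncation 1 + sum_{i=1}^{count} theta_i t^i of D(f_p, t)."""
    thetas = kneading_coefficients(p, count)
    return 1.0 + sum(theta * t ** (i + 1) for i, theta in enumerate(thetas))


if __name__ == "__main__":
    for p in (3.6, 3.8, 4.0):
        print(p, kneading_coefficients(p, 8), kneading_partial_sum(p, 0.5, 8))
```

Evaluating `kneading_partial_sum` on a grid of parameter values gives a quick numerical check of whether the truncated invariant varies monotonically over an interval of interest.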

We  claim  that 

|1  +  e'  >  |1  +  ''e'  (d.37) 

«=1  i=l 

If  this  claim  is  true,  the  rest  of  the  lemma  follows.  At  this  point  we  shall  finish  the  proof 
of  the  lemma  before  coming  back  to  the  proof  of  the  claim. 

From  (D.35)  and  (D.36)  we  know  that 

Od-bifp')  =  +1  (D.38) 

Also,  since  g2{p)  P4(p)  for  p  G  [po,?')^  and  /'^(p4(p*),p*)  =  c,  we  know  f^{g2{p*),P*)  = 

f'^~^{c^p*)  ^  c.  Combining  this  result  with  the  fact  that  /p.  is  monotone  on  Mx{p*)  we 
see  that  if  f^~’’{c,p*)  >  c  then  hais  a  maximum  at  a:  =  c,  which  implies  that 
must  have  a  minimum  at  x  =  c.  Otherwise,  if  f’^~’’{c,p*)  <  c  then  has  a  minimum 
at  X  =  c,  and  again  has  a  minimum  at  x  =  c.  Thus  we  conclude  that: 


ed-b{fp')  =  -1.  (D.39) 

Finally, combining (D.38) with (D.39) with the claim above we find that $|D(f_{p'}, t)| >
|D(f_{p^*}, t)|$. But since $p' > p^*$, this contradicts the assumption that the kneading invariant
of $f_p$ is monotonically decreasing with respect to $p$. This proves the lemma, except for
the proof of the claim which we give below.

We now prove the claim given in (D.37) by induction on $i$. Suppose that $\theta_{i-1}(f_{p'}) \ge
\theta_{i-1}(f_{p^*})$. We shall show that $\theta_i(f_{p'}) \ge \theta_i(f_{p^*})$.

Since  f{92{p'),p')  =  f\92{p*),P*)  =  c,  we  can  see  that 

s9n{Df{f{c,p)))  =  S9n{Df{f^'^'{92{p),p),p)) 

for  either  p  =  p'  or  p  =  p*.  Since  R{n,y,pQ)  does  not  cross  the  boundary  of  M  for  any 
y  €  /x,  we  can  see  that  either  both  f^'^*i92{p'),p')  ^  ^  and  f^'^^{92{p*)iP*)  >  c  or  both 
f+^{92{p'),p')  <  c  and  f+^{92ip*),P*)  <  c  since  both  {92{p'),p')  and  {92(9*), P*)  are 
on  the  boundary  of  M.  Furthermore  from  our  choice  of  p*  and  6pi  >  0  we  know  that 
if  5’(c,p0  c  then  ^*(c,p*)  ^  c  for  0  <  i  <  5  -  d.  Consequently  we  can  see  that  if 

9^{ciP')  7^  c  then 

This  in  turn  implies  Oi{fp>)  =  9i(fp*)  since  9i{fp)  =  £t(/p)di-i(/p)-  On  the  other  hand,  if 
g^{c,p')  =  c,  then  Oi{fp')  =  +1  so  we  automatically  know  that  0i{fp')  >  Oi{fp*). 

Finally,  note  that  the  0j(/p')  >  9i{fp*)  i^  satisfied  for  i  —  \  since  we  have  Oi[fpi)  — 
0i(/p.)  from  (D.40)  if  9{c,p')  =  c  and  0i(/pO  >  0i(/p.)  if  ^(c,pO  =  c.  This  completes  the 
proof  of  the  claim. 

Lemma D.3.3 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Let $p_0 \in \mathrm{int}(I_p)$ and $M_p = \sup_{x \in I_x}(D_p f(x, p_0))$. Given $x_0 \in I_x$
such that $n = N(x_0, f_{p_0}) < \infty$, then for each $p \in J(x_0, p_0)$:

$$|h'_{(x_0,p_0)}(p)| \le \frac{M_p}{|D_x f(f^{n-1}(h_{(x_0,p_0)}(p), p), p)|} \sum_{i=0}^{n-1} \frac{1}{|D_x f^i(h_{(x_0,p_0)}(p), p)|}$$

Proof  In  order  to  prove  the  lemma,  we  first  need  the  following  result  (which  can  be 
found,  for  example,  on  page  417  of  [33]). 

Claim:  For  any  a:  G  4  and  n  >  1  : 

|C,r(i,P)l  <  Mfji\D,r''~‘{f{x,p),p)\  (D-41) 


Proof of claim: Proof by induction on $n$. For $n = 1$ the claim is clearly true. By the
chain rule, for any $n > 1$:

$$D_p f^n(x,p) = D_p f(f^{n-1}(x,p), p) + D_x f(f^{n-1}(x,p), p)\, D_p f^{n-1}(x,p)$$

Thus we have the following

Pp/"(^.p)i  <  M,  +  \Dj(r-\=‘,?),p)\\Ovr-\x,p)\ 

<  A/p  +  |Dp/(/“-'(x,p),p)|A/p£'  \D,r^-‘U‘(x,p),p)\ 


<  A/p  +  A/pS|Z3./"-'-(f(:r.p).rt| 

1=0 

<  A/p2ic.r"'-‘(fKp),p)i 

i=0 

This  completes  the  induction  argument  and  proves  the  claim. 
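The chain-rule recursion used in the proof of the claim is easy to check numerically. The Python sketch below evaluates $D_p f^n(x,p)$ by that recursion and compares it with the bound obtained by unrolling the recursion and applying the triangle inequality, namely $M_p \sum_{i=0}^{n-1} |D_x f^{n-1-i}(f^{i+1}(x,p),p)|$. The indexing of this bound is an assumption and may differ from the convention printed in (D.41). The example family $f(x,p) = p\,x(1-x)$ on $[0,1]$, for which $\sup_x |D_p f| = 1/4$, is likewise only a hypothetical test case.

```python
def f(x, p):
    """Example map f(x, p) = p x (1 - x); an assumption for illustration."""
    return p * x * (1.0 - x)


def dfdx(x, p):
    return p * (1.0 - 2.0 * x)


def dfdp(x, p):
    return x * (1.0 - x)


def orbit(x, p, n):
    """Return [x_0, x_1, ..., x_n] with x_k = f^k(x, p)."""
    xs = [x]
    for _ in range(n):
        xs.append(f(xs[-1], p))
    return xs


def param_derivative(x, p, n):
    """D_p f^n(x, p) from the recursion
    D_p f^k = D_p f(x_{k-1}, p) + D_x f(x_{k-1}, p) * D_p f^{k-1}."""
    dp = 0.0
    for _ in range(n):
        dp = dfdp(x, p) + dfdx(x, p) * dp
        x = f(x, p)
    return dp


def unrolled_bound(x, p, n, m_p=0.25):
    """M_p * sum_i |D_x f^{n-1-i}(x_{i+1}, p)|, where the inner derivative
    is the product of f'(x_k) for k = i+1, ..., n-1."""
    xs = orbit(x, p, n)
    total = 0.0
    for i in range(n):
        prod = 1.0
        for k in range(i + 1, n):
            prod *= dfdx(xs[k], p)
        total += abs(prod)
    return m_p * total


if __name__ == "__main__":
    x, p, n = 0.3, 3.7, 6
    print(param_derivative(x, p, n), unrolled_bound(x, p, n))
```

A central finite difference in `p` gives an independent check on `param_derivative`, which is useful before trusting the recursion inside a larger parameter-estimation routine.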

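Returning to the proof of the lemma, we know that $f^n(h_{(x_0,p_0)}(p), p) = c$ for
$p \in J(x_0, p_0)$. Consequently

$$\frac{d}{dp}\big[f^n(h_{(x_0,p_0)}(p), p)\big] = 0 \tag{D.42}$$

By the chain rule:

$$\frac{d}{dp}\big[f^n(h_{(x_0,p_0)}(p), p)\big] = D_x f^n(h_{(x_0,p_0)}(p), p)\, h'_{(x_0,p_0)}(p) + D_p f^n(h_{(x_0,p_0)}(p), p) \tag{D.43}$$

Thus, combining (D.42) and (D.43), we have:

$$|h'_{(x_0,p_0)}(p)| = \frac{|D_p f^n(h_{(x_0,p_0)}(p), p)|}{|D_x f^n(h_{(x_0,p_0)}(p), p)|} \tag{D.44}$$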

Let  Xp  =  h^xo,po)ip)-  Then,  combining  (D.41)  and  (D.44)  we  have; 

/  Mpi:t=o\Dxr-^-^{nxp,p),p)\ 

-  |D./"(x„p)| 

Mp  ^  \Dxr~'^{f{xp,p),p)\ 


\Dxf^{xp,p)\  ^  \DJ^{p{xp,p),p)\ 

M  ^  1 

“  \Dxf{f^~^{xp,p),p)\  2  |D^/’(/’(a:p,p),p)r 

provided  p  G  J(xo,po)-  This  proves  the  lemma. 

Lemma D.3.4 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Suppose that $p_0 \in \mathrm{int}(I_p)$, and $f_{p_0}$ satisfies (CE1). Also, suppose
that $x_0 \in I_x$ such that $n = N(x_0, f_{p_0}) < \infty$, and $\min_{0 \le i < n} |f^i(x_0, p_0) - c| = \alpha_{x_0} > 0$.
Then there exists a constant $C_1 > 0$ (independent of $x_0$) such that

$$|h'_{(x_0,p_0)}(p_0)| \le \frac{C_1}{\alpha_{x_0}}$$

Proof: From lemma D.3.3:

$$|h'_{(x_0,p_0)}(p_0)| \le \frac{M_p}{|D_x f(f^{n-1}(x_0,p_0), p_0)|} \sum_{i=0}^{n-1} \frac{1}{|D_x f^i(x_0,p_0)|} \tag{D.45}$$

From lemma D.2.7, we also know that $f_{p_0}$ satisfies condition (CE2). Thus, since $f^n(x_0,p_0) =
c$, we know there exists $K_E > 0$ such that $|D_x f(f^{n-1}(x_0,p_0), p_0)| \ge K_E$. Substituting
this into (D.45) we have:

$$|h'_{(x_0,p_0)}(p_0)| \le \frac{M_p}{K_E} \sum_{i=0}^{n-1} \frac{1}{|D_x f^i(x_0,p_0)|} \tag{D.46}$$

From  lemma  (D.2.11)  we  know  that  there  exists  C  >  0  and  A  >  0  such  that: 
Then  from  (D.46), 


W{x)\  >  Cal^y 


\hix„po)iPo)\  -  Ke  ho  ~  KECal^ 

if  we  set  C\  =  This  proves  the  lemma. 


Lemma D.3.5 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Let $p_0 \in I_p$ and suppose that $x_0 \in I_x$ such that $n = N(x_0, f_{p_0}) < \infty$
and $\min_{0 \le i < n} |f^i(x_0,p_0) - c| = \alpha_{x_0} > 0$. Then for any $0 < \beta < 1$ there exists $0 < C_2 < \frac{1}{2}$
such that if $x_1 \in I_x$, $p_1 \in I_p$ satisfy:

(1) $|p_1 - p_0| < C_2 \alpha_{x_0}$

(2) $|f^i(x_1,p_1) - f^i(x_0,p_0)| < C_2 \alpha_{x_0}$ for $0 \le i < n$

then

$$\frac{|D_x f^i(x_1,p_1)|}{|D_x f^i(x_0,p_0)|} \ge \beta^i$$

for $0 \le i \le n$.

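Proof. Combining lemmas D.2.1 and D.2.2 with conditions (1) and (2) above we find
that there exist $K_0 > 0$, $K_1 > 0$, and $K_2 > 0$ such that:

$$|D_x f(f^i(x_1,p_1), p_1) - D_x f(f^i(x_1,p_1), p_0)| \le K_0 |p_1 - p_0| < K_0 C_2 \alpha_{x_0} \tag{D.47}$$

$$|D_x f(f^i(x_1,p_1), p_0) - D_x f(f^i(x_0,p_0), p_0)| \le K_1 |f^i(x_1,p_1) - f^i(x_0,p_0)| < K_1 C_2 \alpha_{x_0} \tag{D.48}$$

$$|D_x f(f^i(x_0,p_0), p_0)| \ge K_2 |f^i(x_0,p_0) - c| \ge K_2 \alpha_{x_0} \tag{D.49}$$

for all $0 \le i < n$.

From (D.47) and (D.48) we have:

$$\begin{aligned}
|D_x f(f^i(x_1,p_1), p_1) - D_x f(f^i(x_0,p_0), p_0)|
&\le |D_x f(f^i(x_1,p_1), p_1) - D_x f(f^i(x_1,p_1), p_0)| \\
&\quad + |D_x f(f^i(x_1,p_1), p_0) - D_x f(f^i(x_0,p_0), p_0)| \\
&< K_0 C_2 \alpha_{x_0} + K_1 C_2 \alpha_{x_0} = C_2 (K_0 + K_1) \alpha_{x_0}
\end{aligned} \tag{D.50}$$

for all $0 \le i < n$.

Now set $C_2 = \min\{\frac{1}{2}, \frac{K_2}{K_0 + K_1}(1 - \beta)\}$. Then from (D.50) and (D.49):

$$\begin{aligned}
\frac{|D_x f(f^i(x_1,p_1), p_1)|}{|D_x f(f^i(x_0,p_0), p_0)|}
&\ge 1 - \frac{|D_x f(f^i(x_1,p_1), p_1) - D_x f(f^i(x_0,p_0), p_0)|}{|D_x f(f^i(x_0,p_0), p_0)|} \\
&\ge 1 - \frac{C_2 (K_0 + K_1) \alpha_{x_0}}{K_2 \alpha_{x_0}} \\
&\ge 1 - \Big(\frac{K_2}{K_0 + K_1}(1 - \beta)\Big)\frac{K_0 + K_1}{K_2} = \beta
\end{aligned}$$

for all $0 \le i < n$. Thus we have:

$$\frac{|D_x f^i(x_1,p_1)|}{|D_x f^i(x_0,p_0)|} = \prod_{j=0}^{i-1} \frac{|D_x f(f^j(x_1,p_1), p_1)|}{|D_x f(f^j(x_0,p_0), p_0)|} \ge \beta^i$$

if $0 \le i \le n$, which proves the lemma.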


Lemma D.3.6 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Suppose that $p_0 \in \mathrm{int}(I_p)$, and $f_{p_0}$ satisfies (CE1). Let $x_0 \in I_x$
such that $n = N(x_0, f_{p_0}) < \infty$ and $\min_{0 \le i < n} |f^i(x_0,p_0) - c| = \alpha_{x_0} > 0$. Then there exist
$C_3 > 0$ and $C_4 > 0$ (independent of $x_0$) such that

$$|h'_{(x_0,p_0)}(p)| \le \frac{C_3}{\alpha_{x_0}^2}$$

if $p \in V(x_0,p_0)$ where $V(x_0,p_0) = [p_0, p_0 + \delta p_1]$, $\delta p_1 = C_4 \alpha_{x_0}^3$, and $h_{(x_0,p_0)} : V(x_0,p_0) \to I_x$
is a function satisfying $h_{(x_0,p_0)}(p_0) = x_0$ and $f^n(h_{(x_0,p_0)}(p), p) = c$ for all $p \in V(x_0,p_0)$.

Proof: From lemma D.3.1 we know that there exists a function $h_{(x_0,p_0)}(p)$ such that
$h_{(x_0,p_0)}(p_0) = x_0$ and $f^n(h_{(x_0,p_0)}(p), p) = c$ if $p \in J(x_0,p_0)$ where $J(x_0,p_0) \subset I_p$ is an
interval containing $p_0$. Also from lemma D.3.1 we know that there exists a continuous
function $g_{(x_0,p_0)}(p)$ satisfying $g_{(x_0,p_0)}(p_0) = x_0$ and $f^n(g_{(x_0,p_0)}(p), p) = c$ for all $p \in I_p$.

By lemma D.2.11, there exist $C > 0$ and $\lambda > 0$ such that:

$$|D_x f^i(x_0,p_0)| \ge C \alpha_{x_0} \lambda^i \tag{D.51}$$

for any $0 \le i \le n$.

Now  fix  Ai  =  >  1  and  let  ^  ^  <  1.  Then  given  g(a^o,Po){p)^  we  know  from 

lemma  D.3.5  that  there  exists  a  constant  Q  <  C2  <  \  (dependent  only  on  such  that 
if  V{xo,Po)  C  Ip  is  the  maximal  interval  satisfying  the  following  conditions: 

(1)  If  p  €  y(xo,po),  then  \p  -  po\  <  C20Cxo- 

(2)  If  p  €  y(a:o,Po),  then  |f  (p(.o.po)(p).p)  “  f  (^o,Po)l  <  <^20*0  for  0  <  i  <  n, 


then  p  e  y(xo,Po)  implies  that: 

\Dxf'ig(  xo,po  )(p)>p)l  >  Qi  m  52) 

\DJKxo.Po)\  - 

iox  any  0  <  z  <  n.  Note  that  by  setting  Ai  >  0,  we  have  also  set  the  constants  0  <  /?  <  1 
and  0  <  C2  <  |,  so  these  constants  are  fixed  for  the  discussion  that  follows. 

Note,  also,  that  from  condition  (2)  above  it  is  apparent  that  P(a;o,po)  7^  ^  for  any  p  6 
V{xo,po)-  From  lemma  D. 3.1,  this  implies  that  ^^(xcPo)  C  J(xo,po)  so  that  g(xo,po){p)  — 
h=^o,Po)iP)  is  when  p  €  y(xo,po). 

Now  consider  the  sequence  {p_t}”_o  where  z/-,-  =  /”  ’(xqjPo)  so  that  y-n  =  and 
yo  =  c.  Then,  from  (D.51),  (D.52),  and  our  choice  of  we  know  that: 

\Dxf{hy_„pM,p)\  >  \Dj\y-i,poW  > 

if  p  e  V{y-i,po)  for  any  0  <  z  <  n.  Substituting  this  into  lemma  D.3.3  we  find  that  if 
p  e  y(a:o,Po)  : 


^  |I>^/(z(p),p),p)|  \DJ^{h(y_,,p^){p),p)\ 


(D.53) 


Where  z{p)  =  /"-'(^(xo.Polb).?)-  Since  fp,  satisfies  (CE2)  and  fiz{p),p)  =  c,  we  can 
bound  |D/(z(po),Po)l  >  Ke  for  some  constant  Ke  >  0  independent  of  xq.  Consequently 
from  condition  (2)  above  and  lemma  D.2.1  there  must  exist  K'^  >  0  (independent  of  xq) 
such  that  \Df{z{p),po)\  >  if  p  €  y(xo,Po).  Substituting  this  into  (D.53)  we  have; 

<  _ )( _ I - ) 
(D.54)


Thus  setting  Cz  =  ^-lo  we  have  that 

^Xo 

for  0  <  i  <  n  if  p  G  V{xo,pq).  Of  course,  since  xq  =  y-n,  this  also  implies  that 

|Ak,„)(p)l  < 

if  p  G  V{xo,po). 

This  places  the  proper  bound  on  the  derivative  ^(xo,po)(^)‘  ^ 

proper  bound  on  the  size  of  V{xo,po).  Set 


$$\delta p = \min\Big\{\frac{C_2}{2 C_3}\alpha_{x_0}^3,\ C_2 \alpha_{x_0},\ \sup(I_p) - p_0\Big\}. \tag{D.55}$$


We claim that if $[p_0, p_0 + \delta p] \subset V(y_{-(i-1)}, p_0)$, then $[p_0, p_0 + \delta p] \subset V(y_{-i}, p_0)$. Also,
it is clear that $[p_0, p_0 + \delta p] \subset V(c, p_0) = V(y_0, p_0)$. So, by induction on $i$, this claim
implies that $[p_0, p_0 + \delta p] \subset V(y_{-n}, p_0) = V(x_0, p_0)$. Thus if the claim is true, then
from (D.55), and since $\alpha_{x_0}$ is bounded above, we know there exists $C_4 > 0$ such that
$[p_0, p_0 + \delta p_1] \subset V(x_0,p_0)$ where $\delta p_1 = C_4 \alpha_{x_0}^3$. This proves the lemma. Thus, all that is
left to do is to prove the claim.

Suppose that the claim were not true. This means there exists $p_1 \in [p_0, p_0 + \delta p]$
such that $p_1 \notin V(y_{-i}, p_0)$. From our specification of $V(x_0,p_0)$ and the intermediate
value theorem, it is apparent that the only way this can happen is if there exists some
$p_2 \in [p_0, p_1]$ such that

$$|h_{(y_{-i},p_0)}(p_2) - y_{-i}| = C_2 \alpha_{x_0} \tag{D.56}$$

and $[p_0, p_2] \subset V(y_{-i}, p_0)$.

However, by the mean value theorem, we know that

$$|h_{(y_{-i},p_0)}(p_2) - y_{-i}| = |h_{(y_{-i},p_0)}(p_2) - h_{(y_{-i},p_0)}(p_0)| = |h'_{(y_{-i},p_0)}(p_3)|\,|p_2 - p_0| \tag{D.57}$$

for some $p_3 \in [p_0, p_2] \subset V(y_{-i}, p_0)$. But from (D.54):

$$|h'_{(y_{-i},p_0)}(p_3)| \le \frac{C_3}{\alpha_{x_0}^2} \tag{D.58}$$

Combining (D.57), (D.58), and our choice of $\delta p$ we find that

$$|h_{(y_{-i},p_0)}(p_2) - y_{-i}| \le \frac{C_3}{\alpha_{x_0}^2}|p_2 - p_0| \le \frac{C_3}{\alpha_{x_0}^2}\,\delta p \le \frac{1}{2} C_2 \alpha_{x_0}$$

which contradicts (D.56) and proves the claim.

Lemma D.3.7 Let $\{f_p : I_x \to I_x \mid p \in I_p\}$ be a one-parameter family of mappings satisfying
(B0) and (B1). Given any $p_0 \in \mathrm{int}(I_p)$, $x_0 \in I_x$, $p_1 \in \mathrm{int}(I_p)$, and $x_1 \in I_x$,
suppose that $W(x_0) \subset I_x \times I_p$ is a connected set that can be represented in the following
way:

$$W(x_0) = \{(\alpha_{x_0}(t), \beta_{x_0}(t)) \mid t \in [0,1]\}$$

where $\alpha_{x_0} : [0,1] \to I_x$ and $\beta_{x_0} : [0,1] \to I_p$ satisfy the following properties:

1. $\alpha_{x_0}(t)$ and $\beta_{x_0}(t)$ are continuous.

2. $\beta_{x_0}(t)$ is monotonically increasing with respect to $t$.

3. $\alpha_{x_0}(0) = x_0$, $\alpha_{x_0}(1) = x_1$.

4. $\beta_{x_0}(0) = p_0$, $\beta_{x_0}(1) = p_1$.

Then there exist constants $\delta p > 0$ and $C > 0$ (independent of $x_0$) such that if $|x_1 - x_0| >
C|p_1 - p_0|^{1/3}$ and $|p_1 - p_0| < \delta p$, then

$$W(x_0) \cap R(n, y, p_0) \cap (I_x \times [p_0, p_0 + \delta p]) \ne \emptyset$$

for some $n > 0$ and $y \in I_x$ such that $y \ne x_0$.

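Proof: We assume that $x_1 > x_0$ and $p_1 > p_0$ (the other cases are similar). From
lemma D.2.13, we know that there exist constants $K_0 > 0$ and $\epsilon_0 > 0$ so that for any
positive $\epsilon < \epsilon_0$, there is a $y \in (x_0, x_0 + \epsilon)$ such that $f^n(y, p_0) = c$ and $\min_{0 \le i < n} |f^i(y, p_0) - c| >
K_0 \epsilon$ for some $n > 0$. From lemma D.3.6, we know that there exist constants $K_1 > 0$ and
$K_2 > 0$ such that if

$$\delta p_\epsilon = K_1 (K_0 \epsilon)^3 \tag{D.59}$$

then for all $p \in [p_0, p_0 + \delta p_\epsilon]$:

$$|h'_{(y,p_0)}(p)| < K_2 \Big(\frac{1}{K_0 \epsilon}\Big)^2 \tag{D.60}$$

Thus given $x_0 \in I_x$, $x_1 \in I_x$, $p_0 \in \mathrm{int}(I_p)$, and $p_1 \in \mathrm{int}(I_p)$ choose

$$\epsilon = \frac{1}{K_0}\Big(\frac{p_1 - p_0}{K_1}\Big)^{1/3} \tag{D.61}$$

Also, set $\delta p = K_1 (K_0 \epsilon_0)^3$. Note that this means $p_1 - p_0 < \delta p$ implies that $\epsilon < \epsilon_0$, so that
the results of the previous paragraph hold.

In particular, if we substitute (D.61) into (D.59), we find that $\delta p_\epsilon = K_1 (K_0 \epsilon)^3 =
p_1 - p_0$ so that from (D.60) we have that for all $p \in [p_0, p_1]$:

$$|h'_{(y,p_0)}(p)| < K_2 \Big(\frac{1}{K_0 \epsilon}\Big)^2$$

for some $y \in (x_0, x_0 + \epsilon)$. Consequently:

$$\begin{aligned}
h_{(y,p_0)}(p_1) &\le h_{(y,p_0)}(p_0) + (p_1 - p_0) \max_{p \in [p_0,p_1]} |h'_{(y,p_0)}(p)| \\
&< y + K_2 \Big(\frac{1}{K_0 \epsilon}\Big)^2 (p_1 - p_0) \\
&< (x_0 + \epsilon) + K_2 \Big(\frac{1}{K_0 \epsilon}\Big)^2 (p_1 - p_0) \\
&= x_0 + \frac{1 + K_0 K_1 K_2}{K_1^{1/3} K_0} (p_1 - p_0)^{1/3} \\
&= x_0 + C (p_1 - p_0)^{1/3}
\end{aligned} \tag{D.62}$$

where $C = \frac{1 + K_0 K_1 K_2}{K_1^{1/3} K_0}$.

Now suppose that $(x_1, p_1) \in W(x_0)$ where $x_1 - x_0 > C|p_1 - p_0|^{1/3}$. From (D.62) we know
that there exists a continuous function, $h_{(y,p_0)}(p)$, such that $(h_{(y,p_0)}(p), p) \in R(n, y, p_0)$
for all $p \in [p_0, p_1]$ where $h_{(y,p_0)}(p_0) = y > x_0$ and $h_{(y,p_0)}(p_1) < x_1$. We are also given that
$W(x_0)$ can be represented as $W(x_0) = \{(\alpha_{x_0}(t), \beta_{x_0}(t)) \mid t \in [0,1]\}$. Using the Intermediate
Value Theorem, it can be shown that $h_{(y,p_0)}(\beta_{x_0}(t_1)) = \alpha_{x_0}(t_1)$ for some $t_1 \in [0,1]$. This
implies that

$$W(x_0) \cap R(n, y, p_0) \cap (I_x \times [p_0, p_0 + \delta p]) \ne \emptyset \tag{D.63}$$

which proves the lemma.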


Proof of Theorem 3.3.2: Note that the theorem is trivial if $x_0 \in \mathrm{Bd}(I_x)$ (where $\mathrm{Bd}(I_x)$
denotes the boundary of $I_x$). Otherwise, fix $p_0 \in \mathrm{int}(I_p)$ such that $f_{p_0}$ satisfies (CE1)
and suppose there exists $\delta p_1 > 0$ such that $D(f_p, t)$ is monotonically decreasing for
$p \in [p_0, p_0 + \delta p_1]$. Given any $x_0 \in \mathrm{int}(I_x)$ let:

$$X_n^-(x_0) = \{x \mid N(x, f_{p_0}) \le n \text{ and } x < x_0\}$$

$$X_n^+(x_0) = \{x \mid N(x, f_{p_0}) \le n \text{ and } x > x_0\}$$

Define the following functions $a^-_{n,x_0} : I_p \to I_x$ and $a^+_{n,x_0} : I_p \to I_x$:

$$a^-_{n,x_0}(p) = \sup_{x' \in X_n^-(x_0)} \{x \mid (x,p) \in R(n, x', p_0),\ p \in I_p\} \tag{D.64}$$

$$a^+_{n,x_0}(p) = \inf_{x' \in X_n^+(x_0)} \{x \mid (x,p) \in R(n, x', p_0),\ p \in I_p\} \tag{D.65}$$

It is apparent from our specification of $R(n, x, p_0)$ that $a^-_{n,x_0}(p)$ and $a^+_{n,x_0}(p)$ must be
continuous with respect to $p$.

First of all note that $a^-_{m,x_0}(p) \ge a^-_{n,x_0}(p)$ if $m > n$. Furthermore, we claim that for
any $n > 0$ there exists $m > n$ such that $a^-_{m,x_0}(p) > a^-_{n,x_0}(p)$ for all $p \in [p_0, p_0 + \delta p_1]$. By
lemma D.3.2 we know that if $D(f_p, t)$ is monotonically decreasing for $p \in [p_0, p_0 + \delta p_1]$
then $R(n, x, p_0)$ and $R(n, x', p_0)$ do not intersect in the region $I_x \times [p_0, p_0 + \delta p_1]$ provided
$x \ne x'$. This implies that we can rewrite (D.64) as:

$$a^-_{n,x_0}(p) = \sup\{x \mid (x,p) \in R(n, x_n^*, p_0)\} \tag{D.66}$$

where $x_n^* = \sup\{X_n^-(x_0)\}$. Also we know from lemma D.2.9 that given any $n > 0$
there exists some $m > n$ such that $x_m^* > x_n^*$. This proves the claim. Similarly, we also
can show that for any $n > 0$ there exists $m > n$ such that $a^+_{m,x_0}(p) < a^+_{n,x_0}(p)$ for all
$p \in [p_0, p_0 + \delta p_1]$.

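Returning to the lemma, we note that since $a^-_{n,x_0}(p)$ is monotonically increasing with
respect to $n$, and bounded above by $\sup I_x = 1$, there exists a function, $a^-_{x_0}(p)$, such that
the limit

$$a^-_{x_0}(p) = \lim_{n \to \infty} a^-_{n,x_0}(p) \tag{D.67}$$

converges pointwise. Now set

$$b^-_{x_0}(p) = \limsup_{t \to p} a^-_{x_0}(t) \tag{D.68}$$

and define

$$S^-(x_0) = \{(x,p) \mid \liminf_{t \to p} b^-_{x_0}(t) \le x \le \limsup_{t \to p} b^-_{x_0}(t)\}. \tag{D.69}$$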

Similarly  we 

can  also  define  5'^(xo)  as  follows: 

<(p)  = 

Koip)  = 

•^^(Xo)  =  {(a:,p)|liminffe+(t)<x<limsup6+(t)}. 

t— ♦•p  t— ►p 

The next step is to show that

$$S^-(x_0) \cap R(n, x, p_0) \cap (I_x \times [p_0, p_0 + \delta p_1]) = \emptyset \tag{D.70}$$

for any $x \ne x_0$ and any $n > 0$. This will be done in two parts. First we address the case
where $x < x_0$. We claim that (D.70) is true if $x < x_0$. Suppose the claim is not true. Then
from (D.64) there must exist some $(x', p') \in S^-(x_0)$ and $n > 0$ such that $a^-_{n,x_0}(p') \ge x'$
where $p' \in [p_0, p_0 + \delta p_1]$. But we have already seen that for any $n > 0$ there exists an
$m > n$ such that $a^-_{m,x_0}(p) > a^-_{n,x_0}(p)$ for all $p \in [p_0, p_0 + \delta p_1]$. Thus $a^-_{x_0}(p) > a^-_{n,x_0}(p)$ for
any $n > 0$ if $p \in [p_0, p_0 + \delta p_1]$. Consequently since $a^-_{n,x_0}(p)$ is continuous:

$$\begin{aligned}
x' \le a^-_{n,x_0}(p') &= \liminf_{t_1 \to p'} \limsup_{t \to t_1} a^-_{n,x_0}(t) \\
&< \liminf_{t_1 \to p'} \limsup_{t \to t_1} a^-_{x_0}(t) = \liminf_{t_1 \to p'} b^-_{x_0}(t_1)
\end{aligned}$$

which from (D.69) implies that $(x', p') \notin S^-(x_0)$. This is a contradiction which proves
the claim.

We  now  claim  that  .S'"  (a;o)  H  i2(n,  x,  p)  fl  (/^  x  [po,Po  +  ^Pi])  =  0  if  x  >  xq.  If  this  claim 
is  not  true,  then  from  (D.65)  we  can  see  that  there  must  exist  some  (x',p')  €  S~{xo) 
and  n  >  0  such  that  a^.^^{p')  <  x'  where  p'  €  [po>Po  +  ^Pi]-  Furthermore  there  exists 
m  >  n  such  that  0^,10 (p)  ^t,xoiP)  P  ^  [POjPo  +  <^Pi]-  Thus  there  exists  e  >  0  such 
that  a,m,xoiP')  —  x'  —  2t.  Since  am,xo(p)  continuous,  this  implies  that  there  exists  ^  >  0 
such  that 


<,„(?)<*'-«■  (D-71) 

for any $p$ such that $|p - p'| < \delta$. But since $(x',p') \in S^{-}(x_0)$,

lim  sup  lim  sup  lim  >  x'. 

Since  a~^^{p),  is  continuous,  this  implies  that  for  any  ^  >  0  and  e  >  0  there  is  an  re  >  0 
and  Pi  with  |pi  —  p'|  <  d  such  that  aJ^^j;o(pi)  >  x'  —  t.  Combining  this  with  (D.71)  we  see 
that  there  exists  p2  such  that  a~x^{p2)  —  impossible  by  lemma  D.3.2 

because it implies that $(x',p') \in R(m, x_1, p_0)$ and $(x',p') \in R(n, x_2, p_0)$ for some $n > 0$,
$m > 0$, $x_1 \neq x_2$, and $p' \in [p_0, p_0 + \delta p_1]$. This contradiction proves the claim.

The next step is to show that $S^{-}(x_0) \cup S^{+}(x_0)$ is invariant under $f$. We claim that
if $(x,p) \in S^{-}(x_0)$ then either $(f(x,p),p) \in S^{-}(f(x_0,p_0))$ or $(f(x,p),p) \in S^{+}(f(x_0,p_0))$.
For any $x_0 \in \mathrm{int}(I_x)$, there exists an $\epsilon > 0$ such that $(x_0 - \epsilon, x_0) \subset (I_x \setminus \{c\})$. Let
$J = (x_0 - \epsilon, x_0)$. Then, since $f_{p_0}$ is a diffeomorphism on $J$, for any $y_1 \in f(J, p_0)$ such that
$n(y_1) = N(y_1, f_{p_0}) < \infty$, there exists $y_0 \in J$ such that $y_1 = f(y_0, p_0)$ and $N(y_0, f_{p_0}) =
n(y_1) + 1$. Consequently, from (D.66) we know that there exists $N > 0$ such that for all
$n > N$:


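$$f(a^{-}_{n,x_0}(p), p) = \begin{cases} a^{-}_{n,f(x_0,p_0)}(p) & \text{if } D_x f(x,p_0) > 0 \text{ on } J \\ a^{+}_{n,f(x_0,p_0)}(p) & \text{if } D_x f(x,p_0) < 0 \text{ on } J \end{cases}$$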


for any $p \in [p_0, p_0 + \delta p_1]$ if $x \in \mathrm{int}(I_x)$. This result combined with our specification of
$S^{-}(x_0)$ in (D.67), (D.68), and (D.69) proves the claim. Using the analogous result for
$S^{+}(x_0)$ gives us that $S^{-}(x_0) \cup S^{+}(x_0)$ is invariant under $f$.
Finally, from the formulation of $S^{-}(x_0)$ in (D.69), it is apparent that there exists a
$W^{-}(x_0) \subset S^{-}(x_0)$ such that $W^{-}(x_0)$ can be represented in the following way:

$$W^{-}(x_0) = \{(\alpha_{x_0}(t), \beta_{x_0}(t)) \mid t \in [0,1]\}$$

where $\alpha_{x_0} : [0,1] \to I_x$ and $\beta_{x_0} : [0,1] \to I_p$ are continuous functions and $\beta_{x_0}(t)$ is
monotonically increasing with respect to $t$ with $\beta_{x_0}(0) = p_0$ and $\beta_{x_0}(1) = p_0 + \delta p_1$. Of
course, a similar $W^{+}(x_0) \subset S^{+}(x_0)$ also exists.

Putting it all together, we have now shown that: (1) $S^{-}(x_0) \cup S^{+}(x_0)$ is invariant
under $f$ and (2) $(S^{-}(x_0) \cup S^{+}(x_0)) \cap R(n,x,p_0) \cap (I_x \times [p_0, p_0 + \delta p_1]) = \emptyset$ for any $n > 0$
and any $x \neq x_0$. From property (2) above, lemma D.3.7, and since $W^{-}(x_0) \subset S^{-}(x_0)$, it
is apparent that there exists $\delta p_2 > 0$ and $C > 0$ (independent of $x_0$) such that if $(x,p) \in$
W"(xo)  then  |x-xo|  <  C'(p-po)^-  Set  Sp  =  min{^pi,5p2}  and  let  hF(xo)  =  W~{xo)  for 
p  e  \po,Po  +  8p].  Then  property  (1)  implies  that  given  any  Xo  G  int{Ix),  if  (x,p)  €  VF(xo) 
and  p  G  \po,Po  +  8p],  then  ]/” (x,p)  - /"(xo,Po)|  <  C{p-po)^  for  any  n  >  0.  This  proves 
the  theorem. 
Appendix E

Proof  of  theorem  3.4.2 


This appendix contains the proof for theorem 3.4.2. For reference, the conditions, (CE1)
and (CE2), can be found in the beginning of appendix D.

Theorem 3.4.2 Let $I_p = [0,4]$, $I_x = [0,1]$, and $f_p : I_x \to I_x$ be the family of quadratic
maps such that $f_p(x) = px(1 - x)$ for $p \in I_p$. Then there exist constants $\delta > 0$, $C > 0$,
$K > 0$, and a set $E(\gamma) \subset I_p$ with positive Lebesgue measure for every $\gamma > 1$ such that:

(1) If $\gamma > 1$ and $p_0 \in E(\gamma)$, then $f_{p_0}$ satisfies (CE1).

(2) If $f_{p_0}$ satisfies (CE1), then for any $\epsilon > 0$ sufficiently small, any orbit of $f_{p_0}$ can be
e— shadowed  by  an  orbit  of  fp  for  p  £  \po,po  +  C^]. 

(3) If $\gamma > 1$ and $p_0 \in E(\gamma)$, then for any $\epsilon > 0$, almost no orbits of $f_{p_0}$ can be
$\epsilon$-shadowed by any orbit of $f_p$ for $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$.

That is, the set of possible initial conditions, $x_0 \in I_x$, such that the orbit $\{f^i_{p_0}(x_0)\}_{i=0}^{\infty}$
can be $\epsilon$-shadowed by some orbit of $f_p$ comprises at most a set of Lebesgue measure
zero on $I_x$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$.

Proof of Theorem 3.4.2: We first address parts (1) and (3) of the theorem and come back
to part (2) at the end of the proof.

The basic idea behind parts (1) and (3) is to apply theorem 3.3.1 to theorem 3.4.1.
There are four major steps. We first set lower bounds on the return time of the orbit of
the turning point, $c = \frac{1}{2}$, to neighborhoods of $c$. Next we show that $f_p$ satisfies (CP1)
and favors higher parameters on a positive measure of parameter values. This allows us
to apply theorem 3.3.1. Finally we show that almost every orbit of these maps approaches
arbitrarily close to $c$, so that if the orbit, $\{f^i_{p_0}(c)\}_{i=0}^{\infty}$, cannot be shadowed then almost
all other orbits of $f_{p_0}$ cannot be shadowed either.
We first show that there is a set of parameters of positive measure such that orbits
of the turning point, $\{f^i_p(c)\}_{i=0}^{\infty}$, do not return too quickly to neighborhoods of $c$. This
can be seen from the construction used to prove theorem 3.4.1. In [5] it is shown that
for any $\alpha > 0$, if $S(\alpha) \subset I_p$ is the set of parameters such that $f_{p_0}$ satisfies both (CE1)


and: 

for all $i \in \{0, 1, 2, \ldots\}$, then $S(\alpha)$ has a density point at $p = 4$.

We now show that (CP1) is also satisfied on a positive measure of parameter values.
First consider what happens if $p = 4$:

$$D_p f(c, p = 4) = \tfrac{1}{4} \tag{E.2}$$

$$D_p f(f^n(c, p = 4), p = 4) = 0 \quad \text{for any } n \ge 1 \tag{E.3}$$

$$|D_x f(f^n(c, p = 4), p = 4)| = 4 \quad \text{for any } n \ge 1 \tag{E.4}$$

\DxP{c,p  =  4)1  =  4""^  for  any  n  >  1.  (E.5) 
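The values at $p = 4$ can be checked directly, because the orbit of the turning point is exactly $\frac{1}{2}, 1, 0, 0, \ldots$ in floating point. The following is a minimal sketch of such a check; the helper names (`f`, `dfdx`, `dfdp`, `orbit`) are illustrative and do not come from the report.

```python
# Quadratic family f(x, p) = p x (1 - x) and its partial derivatives.
# All function names here are illustrative assumptions, not the report's notation.
def f(x, p):
    return p * x * (1.0 - x)

def dfdx(x, p):
    return p * (1.0 - 2.0 * x)

def dfdp(x, p):
    return x * (1.0 - x)

def orbit(x, p, n):
    """Return [x, f(x), ..., f^n(x)] for the fixed parameter p."""
    points = [x]
    for _ in range(n):
        points.append(f(points[-1], p))
    return points

c, p = 0.5, 4.0
orb = orbit(c, p, 6)  # at p = 4 the turning-point orbit is 1/2, 1, 0, 0, ...

dp_at_c = dfdp(c, p)                            # compare with (E.2)
dp_along = [dfdp(x, p) for x in orb[1:]]        # compare with (E.3)
dx_along = [abs(dfdx(x, p)) for x in orb[1:]]   # compare with (E.4)
```

Since every factor $|D_x f|$ along the orbit of $f(c)$ equals 4, the product of the first $n$ of them is $4^n$, which is the uniform expansion that (CE1) asks for at this parameter value.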


It is also a simple matter to verify that $f_p$ favors higher parameters at $p = 4$. Note that
from the chain rule we have that:

$$D_p f^n(c,p) = D_x f(f^{n-1}(c,p),p)\, D_p f^{n-1}(c,p) + D_p f(f^{n-1}(c,p),p) \tag{E.6}$$

for any $n \ge 1$ and any $p \in I_p$. Consequently, using continuity arguments we can see that
for any $N > 0$ and $\delta > 0$ there exists $\epsilon_1 > 0$ such that $p \in [4 - \epsilon_1, 4]$ implies that both
of the following hold:


\Dp{c:P)\  > 
\Dp{r{c,p))\  < 


4 

8  for  any  n  G  {2, 3, . . . ,  N]. 


(E.7) 

(E.8) 


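From (E.6) we can see that:

$$\begin{aligned} D_p f^n(c,p) &= D_p f(f^{n-1}(c,p),p) + \sum_{i=0}^{n-2} \Big[ D_p f(f^i(c,p),p) \prod_{j=i+1}^{n-1} D_x f(f^j(c,p),p) \Big] \\ &= \Big[ \prod_{j=1}^{n-1} D_x f(f^j(c,p),p) \Big] \Big[ D_p f(c,p) + \sum_{i=1}^{n-1} \frac{D_p f(f^i(c,p),p)}{\prod_{j=1}^{i} D_x f(f^j(c,p),p)} \Big] \end{aligned} \tag{E.9}$$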
for any $n \ge 1$. But from theorem 3.4.1, we also know that there exists $K_e > 0$ and
$\lambda_e > 1$ and a set $E \subset I_p$ of positive measure such that if $p \in E$, then (CE1) is satisfied
for $f_p$:

$$\Big| \prod_{i=1}^{n} D_x f(f^i(c,p),p) \Big| = |D_x f^n(f(c,p),p)| \ge K_e \lambda_e^{n}.$$
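The chain-rule recursion (E.6) and its unrolled sum form can be compared numerically with a finite-difference estimate of the parameter derivative. The sketch below is only an illustration for the quadratic family; the function names, the parameter value $p = 3.9$, the iterate count $n = 5$, and the step size are assumptions chosen for the example.

```python
# Illustrative check of the recursion (E.6) for f(x, p) = p x (1 - x).
# Names and numeric choices below are assumptions made for this example.
def f(x, p):
    return p * x * (1.0 - x)

def dfdx(x, p):
    return p * (1.0 - 2.0 * x)

def dfdp(x, p):
    return x * (1.0 - x)

def iterate(x, p, n):
    for _ in range(n):
        x = f(x, p)
    return x

def dp_recursion(c, p, n):
    """D_p f^n(c, p) computed by applying (E.6) n times, starting from D_p f^0 = 0."""
    x, d = c, 0.0
    for _ in range(n):
        d = dfdx(x, p) * d + dfdp(x, p)
        x = f(x, p)
    return d

def dp_sum(c, p, n):
    """Unrolled form: sum over i of D_p f(f^i) times the product of D_x f(f^j), j = i+1..n-1."""
    xs = [c]
    for _ in range(n - 1):
        xs.append(f(xs[-1], p))      # xs[k] = f^k(c, p) for k = 0..n-1
    total = 0.0
    for i in range(n):
        prod = 1.0
        for j in range(i + 1, n):
            prod *= dfdx(xs[j], p)
        total += dfdp(xs[i], p) * prod
    return total

def dp_finite_difference(c, p, n, h=1e-6):
    """Central difference approximation of D_p f^n(c, p)."""
    return (iterate(c, p + h, n) - iterate(c, p - h, n)) / (2.0 * h)

c, p, n = 0.5, 3.9, 5
rec = dp_recursion(c, p, n)
unrolled = dp_sum(c, p, n)
fd = dp_finite_difference(c, p, n)
```

The recursion and the unrolled sum should agree to rounding error, while the finite-difference value agrees only up to its truncation error, so any comparison with it needs a loose tolerance.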

Substituting this into (E.9) we have:

$$|D_p f^n(c,p)| \ge K_e \lambda_e^{n-1} \Big[ |D_p f(c,p)| - \sum_{i=1}^{n-1} \frac{|D_p f(f^i(c,p),p)|}{K_e \lambda_e^{i}} \Big]$$

Substituting  (E.7)  and  (E.8): 


for any $n \ge 1$. Now if we set


1  6  ^“(^+1) 

=  _  Ae’)  ”  iKetl  -  Ai')^ 


we see that $C_e > 0$ if $\delta > 0$ is sufficiently small and $N > 0$ is sufficiently large. From
(E.7) and (E.8) we know that we have full control of $\delta > 0$ and $N > 0$ with our choice
of $\epsilon_1$. So choose $\epsilon_1 > 0$ small enough so that $C_e > 0$ for any $p \in [4 - \epsilon_1, 4]$. Then we
have that:

$$|D_p f^n(c,p)| \ge K_e C_e \lambda_e^{n-1} \tag{E.10}$$

for all $n \ge 1$ if $p \in [4 - \epsilon_1, 4]$ and $f_p$ satisfies (CE1) (i.e., $|D_x f^n(f(c,p),p)| \ge K_e \lambda_e^{n}$ for
all $n \ge 1$). Looking at (E.6), it is also apparent that if (E.10) is satisfied, then since
$|D_p f(f^{n-1}(c,p),p)|$ is bounded, the sign of $D_p f^n(c,p)$ is governed by the signs of $D_x f(f^{n-1}(c,p),p)$
and $D_p f^{n-1}(c,p)$ for $n \ge 1$ sufficiently large. Thus, since $f_p$ favors higher parameters
at $p = 4$, there exists some $\epsilon > 0$ with $\epsilon < \epsilon_1$ such that $f_p$ favors higher parameters if
$p \in [4 - \epsilon, 4]$ and $f_p$ satisfies (CE1).

Consequently, (CP1) must be satisfied and $f_{p_0}$ favors higher parameters for any
$p_0 \in [4 - \epsilon, 4]$ such that $f_{p_0}$ satisfies (CE1). But recall that for any $\alpha > 0$, $S(\alpha)$
has a density point at $p = 4$ and $p_0 \in S(\alpha)$ implies that $f_{p_0}$ satisfies (CE1). So let
$S^{*}(\alpha) = S(\alpha) \cap [4 - \epsilon, 4]$. Then for any $\alpha > 0$ we can see that if $p_0 \in S^{*}(\alpha)$, then
condition (E.1) is satisfied, $f_{p_0}$ satisfies (CE1), and $f_p$ satisfies (CP1) and favors higher
parameters at $p = p_0$. Furthermore, $S^{*}(\alpha)$ has a density point at $p = 4$.
Now recall from section 3.3.1 that $n_e(c, \epsilon, p_0)$ is defined to be the smallest integer
$n \ge 1$ such that $|f^n(c, p_0) - c| < \epsilon$. Thus, if (E.1) is satisfied, then


ne(c,e,po)  > 


(E.ll) 
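The return time $n_e(c, \epsilon, p_0)$ recalled above is easy to compute by direct iteration. The following sketch assumes the quadratic family $f_p(x) = px(1-x)$ with $c = \frac{1}{2}$; the function name `first_return_time` and the iteration cap `max_iter` are hypothetical choices for this example, and the cap means a result of `None` only says that no return was seen within that many iterates.

```python
def first_return_time(c, eps, p0, max_iter=10_000):
    """Smallest n >= 1 with |f^n(c, p0) - c| < eps for f(x, p) = p x (1 - x).

    Returns None if no such n is found within max_iter iterates
    (max_iter is an assumption of this sketch, not part of the definition).
    """
    x = c
    for n in range(1, max_iter + 1):
        x = p0 * x * (1.0 - x)
        if abs(x - c) < eps:
            return n
    return None

# At p0 = 2 the turning point is a fixed point, so the orbit returns immediately.
n_fixed = first_return_time(0.5, 1e-3, 2.0)

# At p0 = 4 the orbit of c is 1/2, 1, 0, 0, ... and never comes back near c.
n_never = first_return_time(0.5, 0.1, 4.0)
```

The second case shows the behaviour the argument needs: at $p = 4$ the turning-point orbit does not return to a neighborhood of $c$ at all.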


But from theorem 3.3.1, we know that if $f_{p_0}$ satisfies (CE1) and $f_p$ satisfies (CP1) and
favors higher parameters at $p = p_0 \in I_p$, then there exist constants $\delta > 0$, $K_0 > 0$, $K_1 > 0$
and $\lambda > 1$ such that there are no orbits of $f_p$ which $\epsilon$-shadow the orbit, $\{f^i_{p_0}(c)\}_{i=0}^{\infty}$, if
p  e  (po  -  S,po  -  Substituting  in  the  condition  (E.ll)  we  find  that: 

(E.12) 


Now suppose we are given any $\gamma > 1$. We can see that if $\alpha > 0$ is sufficiently small then

$$1 + \frac{1}{\alpha} \log \lambda > \gamma. \tag{E.13}$$

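So let $E(\gamma) = S^{*}(\alpha)$ for an $\alpha$ satisfying (E.13). Note that $E(\gamma)$ has positive Lebesgue measure and a density
point at $p = 4$. For any $\gamma > 1$, we also see that if $p_0 \in E(\gamma)$ then $f_p$ satisfies (CP1)
and (CE1) at $p = p_0$. Thus by theorem 3.3.1 and from (E.12) and (E.13) we have
that if $p_0 \in E(\gamma)$ then no orbits of $f_p$ $\epsilon$-shadow the orbit, $\{f^i_{p_0}(c)\}_{i=0}^{\infty}$, if $p \in
(p_0 - \delta, p_0 - K_0 (K_1 \epsilon)^{\gamma})$. But since $\gamma > 1$, if we set constant $K = \max\{K_0 K_1, K_1\} > 0$ we
see that $p_0 - K_0 (K_1 \epsilon)^{\gamma} \ge p_0 - (K\epsilon)^{\gamma}$ for any $\epsilon > 0$. Thus, no orbits of $f_p$ may $\epsilon$-shadow
$\{f^i_{p_0}(c)\}_{i=0}^{\infty}$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$.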
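The choice $K = \max\{K_0 K_1, K_1\}$ works because $K_0 (K_1 \epsilon)^{\gamma} \le (K \epsilon)^{\gamma}$ whenever $\gamma > 1$: if $K_0 \ge 1$ then $K_0 \le K_0^{\gamma}$, and if $K_0 < 1$ the factor $K_0$ only shrinks the left side. A small numerical sanity check of this inequality, over an arbitrary grid of values chosen purely for illustration, might look like this:

```python
def combined_constant(K0, K1):
    """K = max{K0*K1, K1}, as chosen in the text."""
    return max(K0 * K1, K1)

def bound_holds(K0, K1, gamma, eps):
    """Check K0*(K1*eps)**gamma <= (K*eps)**gamma, with a tiny relative
    tolerance to absorb floating-point rounding in the equality case."""
    K = combined_constant(K0, K1)
    return K0 * (K1 * eps) ** gamma <= (K * eps) ** gamma * (1.0 + 1e-12)

# The grid values below are arbitrary test points, not constants from the report.
checks = [
    bound_holds(K0, K1, gamma, eps)
    for K0 in (0.1, 1.0, 2.5, 10.0)
    for K1 in (0.2, 1.0, 3.0)
    for gamma in (1.01, 1.5, 2.0, 4.0)
    for eps in (1e-6, 1e-3, 0.1)
]
all_ok = all(checks)
```

A check like this is no substitute for the two-case argument, but it guards against getting the direction of the inequality wrong when the interval endpoints are compared.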

The final step is to show that almost any orbit of $f_p$ comes arbitrarily close to $c$. This
can be seen from the following two lemmas:

Lemma E.0.8 Let $U$ be a neighborhood of $c$. For any $p \in I_p$, if $E_U = \{x \mid f_p^n(x) \in
I_x \setminus U \text{ for all } n \ge 0\}$ contains no non-trivial intervals, then the Lebesgue measure of $E_U$
is zero.

Proof of lemma E.0.8: See Theorem 3.1 in Guckenheimer [26].

Lemma E.0.9 If $p_0 \in I_p$ and $f_{p_0}$ satisfies (CE1), then the set of preimages of $c$, $C_{p_0} =
\bigcup_{i \ge 0} f_{p_0}^{-i}(c)$, is dense on $I_x$.

Proof of lemma E.0.9: See corollary II.5.5 in Collet and Eckmann [14].
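The property these lemmas are used for, that a typical orbit eventually comes close to the turning point, can be observed numerically. The sketch below tracks the closest approach of one orbit to $c = \frac{1}{2}$; the parameter $p = 4$, the initial condition $0.3$, and the function name `closest_approach` are assumptions made for illustration, and a single finite-precision orbit is of course not evidence for an almost-everywhere statement.

```python
def closest_approach(x0, p, c=0.5, n_iter=10_000):
    """Smallest distance |f^n(x0) - c| over n = 0..n_iter for f(x) = p x (1 - x)."""
    x = x0
    best = abs(x - c)
    for _ in range(n_iter):
        x = p * x * (1.0 - x)
        best = min(best, abs(x - c))
    return best

# Illustrative run: the orbit of 0.3 at p = 4 passes within 0.05 of c
# already at its second iterate (0.3 -> 0.84 -> 0.5376).
d_min = closest_approach(0.3, 4.0)
```

Increasing `n_iter` can only decrease the returned distance, which matches the statement that the orbit approaches $c$ arbitrarily closely along a subsequence of iterates.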

From these two lemmas we can see that for almost all $x_0 \in I_x$, the orbit, $\{f^i_{p_0}(x_0)\}_{i=0}^{\infty}$,
approaches arbitrarily close to $c$ if $p_0 \in E(\gamma)$, for any $\gamma > 1$. Thus for almost all $x_0 \in
I_x$, there are arbitrarily long stretches of iterates where the orbit, $\{f^i_{p_0}(x_0)\}_{i=0}^{\infty}$, remains
arbitrarily close to the orbit, $\{f^i_{p_0}(c)\}_{i=0}^{\infty}$. This means that if there are no orbits of $f_p$
that can shadow $\{f^i_{p_0}(c)\}_{i=0}^{\infty}$, there can be no orbits of $f_p$ that can shadow $\{f^i_{p_0}(x_0)\}_{i=0}^{\infty}$.
Consequently for any $\gamma > 1$ if $p_0 \in E(\gamma)$ then $f_{p_0}$ satisfies (CE1) and almost no orbits
of $f_{p_0}$ can be shadowed by any orbit of $f_p$ if $p \in (p_0 - \delta, p_0 - (K\epsilon)^{\gamma})$. This proves parts
(1) and (3) of theorem 3.4.2.

Part (2) of theorem 3.4.2 is a direct result of Corollary 3.3.1, Theorem 3.4.1, and the
following result, due to Milnor and Thurston:

Lemma E.0.10 The kneading invariant, $D(f_p, t)$, is monotonically decreasing with respect
to $p$ for all $p \in I_p$.

Proof of lemma E.0.10: See theorem 13.1 in [34].


Thus there exists a constant $C > 0$ such that if $p_0 \in E(\gamma)$, so that $f_{p_0}$ satisfies (CE1),
then any orbit of $f_{p_0}$ can be $\epsilon$-shadowed by an orbit of $f_p$ for every $p$ in the parameter interval of part (2). This is
exactly part (2) of the theorem.

This  concludes  the  proof  of  theorem  3.4.2. 